From coleen.phillimore at oracle.com Mon Oct 2 14:55:14 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 2 Oct 2017 10:55:14 -0400 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: References: <50cda0ab-f403-372a-ce51-1a27d8821448@oracle.com> <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> Message-ID: <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> I can sponsor this for you once you rebase and fix these compilation errors. Thanks, Coleen On 9/30/17 12:28 AM, Volker Simonis wrote: > Hi Vladimir, > > thanks a lot for remembering these changes! > > Regards, > Volker > > > Vladimir Kozlov > wrote on Fri, 29 Sep 2017 at > 15:47: > > I hit a build failure when I tried to push the changes: > > src/hotspot/share/code/codeBlob.hpp(162) : warning C4267: '=' : > conversion from 'size_t' to 'int', possible loss of data > src/hotspot/share/code/codeBlob.hpp(163) : warning C4267: '=' : > conversion from 'size_t' to 'int', possible loss of data > > I am going to fix it by casting to (int): > > +  void adjust_size(size_t used) { > +    _size = (int)used; > +    _data_offset = (int)used; > +    _code_end = (address)this + used; > +    _data_end = (address)this + used; > +  } > > Note, the CodeCache size can't be more than 2Gb (max_int) so such casting > is fine. > > Vladimir > > On 9/6/17 6:20 AM, Volker Simonis wrote: > > On Tue, Sep 5, 2017 at 9:36 PM, > wrote: > >> > >> I was going to make the same comment about the friend > declaration in v1, so > >> v2 looks better to me. Looks good. Thank you for finding a > solution to > >> this problem that we've had for a long time. I will sponsor > this (remind me > >> if I forget after the 18th). > >> > > > > Thanks Coleen! I've updated > > > > http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/ > > > > > in-place and added you as a second reviewer. > > > > Regards, > > Volker > > > > > >> thanks, > >> Coleen > >> > >> > >> > >> On 9/5/17 1:17 PM, Vladimir Kozlov wrote: > >>> > >>> On 9/5/17 9:49 AM, Volker Simonis wrote: > >>>> > >>>> On Fri, Sep 1, 2017 at 6:16 PM, Vladimir Kozlov > >>>> > wrote: > >>>>> > >>>>> Maybe add a new CodeBlob method to adjust the sizes instead of > directly > >>>>> setting > >>>>> them in CodeCache::free_unused_tail(). Then you would not > need friend > >>>>> class > >>>>> CodeCache in CodeBlob. > >>>>> > >>>> > >>>> Changed as suggested (I didn't like the friend declaration > either :) > >>>> > >>>>> Also I think the adjustment to header_size should be done in > >>>>> CodeCache::free_unused_tail() to limit the scope of code that > knows about > >>>>> blob > >>>>> layout. > >>>>> > >>>> > >>>> Yes, that's much cleaner. Please find the updated webrev here: > >>>> > >>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/ > > >>> > >>> > >>> Good. > >>> > >>>> > >>>> I've also found another "day 1" problem in StubQueue::next(): > >>>> > >>>>       Stub* next(Stub* s) const          { int i = > >>>> index_of(s) + stub_size(s); > >>>> -           if (i == > >>>> _buffer_limit) i = 0; > >>>> +           // Only wrap > >>>> around in the non-contiguous case (see stubss.cpp) > >>>> +           if (i == > >>>> _buffer_limit && _queue_end < _buffer_limit) i = 0; > >>>>            return (i == > >>>> _queue_end) ? NULL : stub_at(i); > >>>>
} > >>>> > >>>> The problem was that the method was not prepared to handle > the case > >>>> where _buffer_limit == _queue_end == _buffer_size which led > to an > >>>> infinite recursion when iterating over a StubQueue with > >>>> StubQueue::next() until next() returns NULL (as this was for > example > >>>> done with -XX:+PrintInterpreter). But with the new, trimmed > CodeBlob > >>>> we run into exactly this situation. > >>> > >>> > >>> Okay. > >>> > >>>> > >>>> While doing this last fix I also noticed that > "StubQueue::stubs_do()", > >>>> "StubQueue::queues_do()" and "StubQueue::register_queue()" > don't seem > >>>> to be used anywhere in the open code base (please correct me > if I'm > >>>> wrong). What do you think, maybe we should remove this code in a > >>>> follow-up change if it is really not needed? > >>> > >>> > >>> register_queue() is used in the constructor. The other 2 you can remove. > >>> stub_code_begin() and stub_code_end() are not used either - remove them. > >>> I thought we ran on linux with a flag which warns about unused code. > >>> > >>>> > >>>> Finally, could you please run the new version through JPRT > and sponsor > >>>> it once jdk10/hs is opened again? > >>> > >>> > >>> Will do when the jdk10 "consolidation" is finished. Please remind > me later if > >>> I forget. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>>> > >>>> Thanks, > >>>> Volker > >>>> > >>>>> Thanks, > >>>>> Vladimir > >>>>> > >>>>> > >>>>> On 9/1/17 8:46 AM, Volker Simonis wrote: > >>>>>> > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> I've decided to split the fix for the > 'CodeHeap::contains_blob()' > >>>>>> problem into its own issue "8187091: > ReturnBlobToWrongHeapTest fails > >>>>>> because of problems in CodeHeap::contains_blob()" > >>>>>> (https://bugs.openjdk.java.net/browse/JDK-8187091) and > started a new > >>>>>> review thread for discussing it at: > >>>>>> > >>>>>> > >>>>>> > http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028206.html > >>>>>> > >>>>>> So please let's keep this thread for discussing the > interpreter code > >>>>>> size issue only. I've prepared a new version of the webrev > which is > >>>>>> the same as the first one with the only difference that the > change to > >>>>>> 'CodeHeap::contains_blob()' has been removed: > >>>>>> > >>>>>> > http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v1/ > > >>>>>> > >>>>>> Thanks, > >>>>>> Volker > >>>>>> > >>>>>> > >>>>>> On Thu, Aug 31, 2017 at 6:35 PM, Volker Simonis > >>>>>> > wrote: > >>>>>>> > >>>>>>> > >>>>>>> On Thu, Aug 31, 2017 at 6:05 PM, Vladimir Kozlov > >>>>>>> > wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>> Very good change. Thank you, Volker. > >>>>>>>> > >>>>>>>> About contains_blob(). The problem is that AOTCompiledMethod is > >>>>>>>> allocated > >>>>>>>> in > >>>>>>>> CHeap and not in the AOT code section (which is RO): > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > http://hg.openjdk.java.net/jdk10/hs/hotspot/file/8acd232fb52a/src/share/vm/aot/aotCompiledMethod.hpp#l124 > >>>>>>>> > >>>>>>>> It is allocated in CHeap after the AOT library is loaded. Its > >>>>>>>> code_begin() > >>>>>>>> points to the AOT code section but the AOTCompiledMethod* points > outside it > >>>>>>>> (to > >>>>>>>> normal malloced space) so you can't use the (char*)blob address. > >>>>>>>> > >>>>>>> > >>>>>>> Thanks for the explanation - now I got it. > >>>>>>> > >>>>>>>> There are 2 ways to fix it, I think. > >>>>>>>> One is to add a new field to CodeBlobLayout and set it to the > blob* address > >>>>>>>> for > >>>>>>>> normal CodeCache blobs and to code_begin for AOT code.
> >>>>>>>> Second is to use contains(blob->code_end() - 1) assuming > that AOT > >>>>>>>> code > >>>>>>>> is > >>>>>>>> never zero. > >>>>>>>> > >>>>>>> > >>>>>>> I'll give it a try tomorrow and will send out a new webrev. > >>>>>>> > >>>>>>> Regards, > >>>>>>> Volker > >>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Vladimir > >>>>>>>> > >>>>>>>> > >>>>>>>> On 8/31/17 5:43 AM, Volker Simonis wrote: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On Thu, Aug 31, 2017 at 12:14 PM, Claes Redestad > >>>>>>>>> > wrote: > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On 2017-08-31 08:54, Volker Simonis wrote: > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> While working on this, I found another problem which > is related to > >>>>>>>>>>> the > >>>>>>>>>>> fix of JDK-8183573 and leads to crashes when executing > the JTreg > >>>>>>>>>>> test > >>>>>>>>>>> compiler/codecache/stress/ReturnBlobToWrongHeapTest.java. > >>>>>>>>>>> > >>>>>>>>>>> The problem is that JDK-8183573 replaced > >>>>>>>>>>> > >>>>>>>>>>>         virtual bool contains_blob(const CodeBlob* > blob) const { > >>>>>>>>>>> return > >>>>>>>>>>> low_boundary() <= (char*) blob && (char*) blob < high(); } > >>>>>>>>>>> > >>>>>>>>>>> by: > >>>>>>>>>>> > >>>>>>>>>>>         bool contains_blob(const CodeBlob* blob) const > { return > >>>>>>>>>>> contains(blob->code_begin()); } > >>>>>>>>>>> > >>>>>>>>>>> But that may be wrong in the corner case where the size > of the > >>>>>>>>>>> CodeBlob's payload is zero (i.e. the CodeBlob consists > only of the > >>>>>>>>>>> 'header' - i.e. the C++ object itself) because in that > case > >>>>>>>>>>> CodeBlob::code_begin() points right behind the > CodeBlob's header, > >>>>>>>>>>> which > >>>>>>>>>>> is a memory location that doesn't belong to the > CodeBlob anymore. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> I recall this change was somehow necessary to allow merging > >>>>>>>>>> AOTCodeHeap::contains_blob and CodeHeap::contains_blob into > >>>>>>>>>> one devirtualized method, so you need to ensure all AOT > tests > >>>>>>>>>> pass with this change (on linux-x64). > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> All of hotspot/test/aot and hotspot/test/jvmci executed > and passed > >>>>>>>>> successfully. Are there any other tests I should check? > >>>>>>>>> > >>>>>>>>> That said, it is a little hard to follow the stages of > your change. > >>>>>>>>> It > >>>>>>>>> seems like > >>>>>>>>> > http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.00/ > > >>>>>>>>> was reviewed [1] but then finally the slightly changed > version from > >>>>>>>>> > http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.01/ > > >>>>>>>>> was > >>>>>>>>> checked in and linked to the bug report. > >>>>>>>>> > >>>>>>>>> The first, reviewed version of the change still had a > correct > >>>>>>>>> version > >>>>>>>>> of 'CodeHeap::contains_blob(const CodeBlob* blob)' while > the second, > >>>>>>>>> checked-in version has the faulty version of that method. > >>>>>>>>> > >>>>>>>>> I don't know why you finally did that change to > 'contains_blob()' > >>>>>>>>> but > >>>>>>>>> I don't see any reason why we shouldn't be able to > directly use the > >>>>>>>>> blob's address for inclusion checking. From what I > understand, it > >>>>>>>>> should ALWAYS be contained in the corresponding CodeHeap > so no > >>>>>>>>> reason > >>>>>>>>> to mess with 'CodeBlob::code_begin()'. > >>>>>>>>> > >>>>>>>>> Please let me know if I'm missing something.
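A minimal standalone illustration of the zero-payload corner case described above, using a hypothetical stand-in type rather than the real HotSpot CodeBlob/CodeHeap classes:

```cpp
#include <cstdio>

// Hypothetical stand-in for a CodeBlob: a header followed by a payload.
// When the payload size is zero, code_begin() == one past the header,
// i.e. an address that no longer belongs to this blob, so an inclusion
// test based on code_begin() can land in the neighbouring blob, while a
// test on the blob address itself cannot.
struct FakeBlob {
  int header_size;
  int payload_size;
  const char* begin() const      { return reinterpret_cast<const char*>(this); }
  const char* code_begin() const { return begin() + header_size; }
  const char* end() const        { return code_begin() + payload_size; }
};

int main() {
  FakeBlob zero_payload{ static_cast<int>(sizeof(FakeBlob)), 0 };
  std::printf("blob [%p, %p), code_begin() = %p (already one past the end)\n",
              static_cast<const void*>(zero_payload.begin()),
              static_cast<const void*>(zero_payload.end()),
              static_cast<const void*>(zero_payload.code_begin()));
  return 0;
}
```

This only makes the pointer arithmetic visible; the real classes lay the payload out after the C++ object inside the same CodeHeap allocation.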
> >>>>>>>>> > >>>>>>>>> [1] > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-July/026624.html > >>>>>>>>> > >>>>>>>>>> I can't help but wonder if we'd not be better served by > disallowing > >>>>>>>>>> zero-sized payloads. Is this something that can ever > actually > >>>>>>>>>> happen except by abuse of the white box API? > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> The corresponding test (ReturnBlobToWrongHeapTest.java) > specifically > >>>>>>>>> wants to allocate "segment sized" blocks which is most > easily > >>>>>>>>> achieved > >>>>>>>>> by allocating zero-sized CodeBlobs. And I think there's > nothing > >>>>>>>>> wrong > >>>>>>>>> with it if we handle the inclusion tests correctly. > >>>>>>>>> > >>>>>>>>> Thank you and best regards, > >>>>>>>>> Volker > >>>>>>>>> > >>>>>>>>>> /Claes > >> > >> > From harold.seigel at oracle.com Mon Oct 2 14:59:25 2017 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 2 Oct 2017 10:59:25 -0400 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> Message-ID: Hi Coleen, The hs runtime changes look good. Thanks! Harold On 9/28/2017 5:36 PM, coleen.phillimore at oracle.com wrote: > > Thank you to Stefan Karlsson offlist for pointing out that the > previous .01 version of this webrev breaks CMS in that it doesn't > remember ClassLoaderData::_handles that are changed and added while > concurrent marking is in progress. I've fixed this bug to move the > Klass::_modified_oops and _accumulated_modified_oops to the > ClassLoaderData and use these fields in the CMS remarking phase to > catch any new handles that are added. This also fixes this bug > https://bugs.openjdk.java.net/browse/JDK-8173988 . > > In addition, the previous version of this change removed an > optimization during young collection, which showed some uncertain > performance regression in young pause times, so I added this > optimization back to not walk ClassLoaderData during young collections > if all the oops are old. The performance results of SPECjbb2015 now > are slightly better, but not significantly. > > This latest patch has been tested on tier1-5 on linux x64 and windows > x64 in the mach5 test harness. > > http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ > > Can I get at least 3 reviewers? One from each of the compiler, gc, > and runtime group at least since there are changes to all 3. > > Thanks! > Coleen > > > On 9/6/17 12:04 PM, coleen.phillimore at oracle.com wrote: >> Summary: Add indirection for fetching mirror so that GC doesn't have >> to follow CLD::_klasses >> >> Thank you to Tom Rodriguez for Graal changes and Rickard for the C2 >> changes. >> >> Ran nightly tests through Mach5 and RBT. Early performance testing >> showed good performance improvement in GC class loader data processing >> time, but nmethod processing time continues to dominate. Also >> performance testing showed no throughput regression. I'm rerunning >> both of these performance tests and will post the numbers.
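A rough sketch of the indirection described in the quoted summary above, with hypothetical stand-in types rather than the real Klass/OopHandle classes: the Klass no longer embeds the mirror oop directly but holds a pointer to an oop slot owned by its ClassLoaderData, so fetching the mirror costs one extra load and the GC only has to visit the handle area.

```cpp
// Hypothetical stand-ins, not the actual HotSpot classes.
class oopDesc;
typedef oopDesc* oop;

struct OopHandleSketch {
  oop* _slot;                              // points into the ClassLoaderData's handle area
  oop resolve() const { return *_slot; }   // "resolving" the handle is a single extra load
};

struct KlassSketch {
  OopHandleSketch _java_mirror;            // was: oop _java_mirror
  oop java_mirror() const { return _java_mirror.resolve(); }
};

int main() {
  oop slot = nullptr;                      // stand-in for the mirror slot owned by the CLD
  KlassSketch k{ OopHandleSketch{ &slot } };
  return k.java_mirror() == nullptr ? 0 : 1;
}
```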
>> >> bug link https://bugs.openjdk.java.net/browse/JDK-8186777 >> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.01/webrev >> >> Thanks, >> Coleen From coleen.phillimore at oracle.com Mon Oct 2 15:05:51 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 2 Oct 2017 11:05:51 -0400 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> Message-ID: <2abe18fc-9ff3-b17f-700e-4cd8ff5c7ee1@oracle.com> Thank you, Harold! Coleen On 10/2/17 10:59 AM, harold seigel wrote: > Hi Coleen, > > The hs runtime changes look good. > > Thanks! Harold > > > On 9/28/2017 5:36 PM, coleen.phillimore at oracle.com wrote: >> >> Thank you to Stefan Karlsson offlist for pointing out that the >> previous .01 version of this webrev breaks CMS in that it doesn't >> remember ClassLoaderData::_handles that are changed and added while >> concurrent marking is in progress.? I've fixed this bug to move the >> Klass::_modified_oops and _accumulated_modified_oops to the >> ClassLoaderData and use these fields in the CMS remarking phase to >> catch any new handles that are added.?? This also fixes this bug >> https://bugs.openjdk.java.net/browse/JDK-8173988 . >> >> In addition, the previous version of this change removed an >> optimization during young collection, which showed some uncertain >> performance regression in young pause times, so I added this >> optimization back to not walk ClassLoaderData during young >> collections if all the oops are old.? The performance results of >> SPECjbb2015 now are slightly better, but not significantly. >> >> This latest patch has been tested on tier1-5 on linux x64 and windows >> x64 in mach5 test harness. >> >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ >> >> Can I get at least 3 reviewers?? One from each of the compiler, gc, >> and runtime group at least since there are changes to all 3. >> >> Thanks! >> Coleen >> >> >> On 9/6/17 12:04 PM, coleen.phillimore at oracle.com wrote: >>> Summary: Add indirection for fetching mirror so that GC doesn't have >>> to follow CLD::_klasses >>> >>> Thank you to Tom Rodriguez for Graal changes and Rickard for the C2 >>> changes. >>> >>> Ran nightly tests through Mach5 and RBT.?? Early performance testing >>> showed good performance improvment in GC class loader data >>> processing time, but nmethod processing time continues to dominate. >>> Also performace testing showed no throughput regression.?? I'm >>> rerunning both of these performance testing and will post the numbers. >>> >>> bug link https://bugs.openjdk.java.net/browse/JDK-8186777 >>> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.01/webrev >>> >>> Thanks, >>> Coleen > From tobias.hartmann at oracle.com Mon Oct 2 15:18:58 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 2 Oct 2017 17:18:58 +0200 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> Message-ID: <34c406b6-e993-d662-8fb8-4e7586775b53@oracle.com> Hi Coleen, On 28.09.2017 23:36, coleen.phillimore at oracle.com wrote: > http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ > > Can I get at least 3 reviewers?? One from each of the compiler, gc, and runtime group at least since there are changes > to all 3. 
The compiler changes look good to me. Found a little typo: - In line 1776 of memnode.cpp: it should be "loads" instead of "load" I just wanted to mention that SharkIntrinsics::do_Object_getClass() would need to be fixed as well but I've seen that you filed JDK-8171853 [1] to remove Shark which is broken with JDK 9 anyway. Best regards, Tobias [1] https://bugs.openjdk.java.net/browse/JDK-8171853 From coleen.phillimore at oracle.com Mon Oct 2 15:24:48 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 2 Oct 2017 11:24:48 -0400 Subject: CFV: New hotspot Group Member: Ioi Lam Message-ID: I hereby nominate Ioi Lam (OpenJDK user name: iklam) to Membership in the hotspot Group. Ioi has been working on the hotspot project for over 5 years and is a Reviewer in the JDK 9 Project with 79 changes.?? He is an expert in the area of class data sharing. Votes are due by Monday, October 16, 2017. Only current Members of the hotspot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Coleen [1]http://openjdk.java.net/census#hotspot [2]http://openjdk.java.net/groups/#member-vote From coleen.phillimore at oracle.com Mon Oct 2 15:31:22 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 2 Oct 2017 11:31:22 -0400 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: <34c406b6-e993-d662-8fb8-4e7586775b53@oracle.com> References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> <34c406b6-e993-d662-8fb8-4e7586775b53@oracle.com> Message-ID: <394576b9-ff3a-acf3-fe6a-a0f924afaa8d@oracle.com> On 10/2/17 11:18 AM, Tobias Hartmann wrote: > Hi Coleen, > > On 28.09.2017 23:36, coleen.phillimore at oracle.com wrote: >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ >> >> Can I get at least 3 reviewers?? One from each of the compiler, gc, >> and runtime group at least since there are changes to all 3. > > The compiler changes look good to me. > > Found a little typo: > - In line 1776 of memnode.cpp: it should be "loads" instead of "load" Thank you Tobias.? I fixed this typo. > > I just wanted to mention that SharkIntrinsics::do_Object_getClass() > would need to be fixed as well but I've seen that you filed > JDK-8171853 [1] to remove Shark which is broken with JDK 9 anyway. Yes, I think we've broken shark for a while now and it should be removed, unless someone in the open wants to take it over.?? I don't have any idea how to build it anymore. Thanks! Coleen > > Best regards, > Tobias > > [1] https://bugs.openjdk.java.net/browse/JDK-8171853 From daniel.daugherty at oracle.com Mon Oct 2 15:33:03 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 2 Oct 2017 09:33:03 -0600 Subject: CFV: New hotspot Group Member: Ioi Lam In-Reply-To: References: Message-ID: <384bce26-f5ae-304e-4607-39e55f23ff11@oracle.com> Vote: yes Dan On 10/2/17 9:24 AM, coleen.phillimore at oracle.com wrote: > I hereby nominate Ioi Lam (OpenJDK user name: iklam) to Membership in > the hotspot Group. > > Ioi has been working on the hotspot project for over 5 years and is a > Reviewer in the JDK 9 Project with 79 changes.?? He is an expert in > the area of class data sharing. > > Votes are due by Monday, October 16, 2017. > > Only current Members of the hotspot Group [1] are eligible to vote on > this nomination. 
Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen > > [1]http://openjdk.java.net/census#hotspot > [2]http://openjdk.java.net/groups/#member-vote > From bob.vandette at oracle.com Mon Oct 2 15:46:48 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Mon, 2 Oct 2017 11:46:48 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <833ba1a5-49fc-bb24-ff99-994011af52aa@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <2d9dd746-63e1-cade-28f9-5ca1ae1c253e@oracle.com> <200F07CB-35DA-492B-B78D-9EC033EE0431@oracle.com> <833ba1a5-49fc-bb24-ff99-994011af52aa@oracle.com> Message-ID: <10D254F1-ADA7-4EEB-A4AA-9BF6F42B72E0@oracle.com> > On Sep 27, 2017, at 9:20 PM, David Holmes wrote: > > Hi Bob, > > On 28/09/2017 1:45 AM, Bob Vandette wrote: >> David, Thank you for taking the time and providing a detailed review of these changes. >> Where I haven?t responded, I?ll update the implementation based on your comments. > > Okay. I've trimmed below to only leave things I have follow up on. > >>> If this is all confined to Linux only then this should be a linux-only flag and all the changes should be confined to linux code. No shared osContainer API is needed as it can be defined as a nested class of os::Linux, and will only be called from os_linux.cpp. >> I received feedback on my other Container work where I was asked to >> make sure it was possible to support other container technologies. >> The addition of the shared osContainer API is to prepare for this and >> recognize that this will eventually be supported other platforms. > > The problem is that the proposed osContainer API is totally cgroup centric. That API might not make sense for a different container technology. Even if Docker is used on different platforms, does it use cgroups on those other platforms? Until we have at least two examples we want to support we don't know how to formulate a generic API. So in my opinion we should initially keep this Linux specific as a proof-of-concept for future more general container support. I was trying to prepare for the JEP implementation where M&M and JFR hooks will need a shared API to call. I was expecting to return a not supported error code on platforms that didn?t have the os specific implementations. I did take a look at a few other types of containers (VMWare?s SDK for example) and they all had similar types of functions for retrieving the number of cpus and quotas along with the memory limits, swap and free space. I assumed that we could clean up the shared APIs once we did the second container support. In any case that work can be done by the JEP integration so I?m ok with making this os/linux specific but I still would like to keep this support in it?s own file (osContainer_linux.cpp and osContainer_linux.hpp) so all the cgroup processing is kept separate and these files don?t have to move later. This would make it easier to support alternate types of containers. I also wanted to avoid adding lots more size to os_linux.cpp. It?s already too big. Bob. > >>>> Since the dynamic selection of CPUs based on cpusets, quotas and shares >>>> may not satisfy every users needs, I?ve added an additional flag to allow the >>>> number of CPUs to be overridden. This flag is named -XX:ActiveProcessorCount=xx. 
>>> >>> I would suggest that ActiveProcessorCount be constrained to being >1 - this is in line with our plans to get rid of AssumeMP/os::is_MP and always build in MP support. Otherwise a count of 1 today won't behave the same as a count of 1 in the future. >> What if I return true for is_MP anytime ActiveProcessorCount is set. I?d like to provide the ability of specifying a single processor. > > If I make the AssumeMP change for 18.3 as planned then this won't be an issue. I'd better get onto that :) > >>> >>> Also you have defined this globally but only accounted for it on Linux. I think it makes sense to support this flag on all platforms (a generalization of AssumeMP). Otherwise it needs to be defined as a Linux-only flag in the pd_globals.hpp file >> Good idea. > > You could even factor this out as a separate issue/task independent of the container work. > >>> Style issue: >>> >>> 2121 if (i < 0) st->print("OSContainer::active_processor_count() failed"); >>> 2122 else >>> >>> and elsewhere. Please move the st->print to its own line. Others may argue for always using blocks ({}) in if/else. >> There doesn?t seem to be consistency on this issue. > > No there's no consistency :( And this isn't in the hotspot style guide AFAICS. But I'm sure it's in some other coding guidelines ;-) > >>> 5024 // User has overridden the number of active processors >>> 5025 if (!FLAG_IS_DEFAULT(ActiveProcessorCount)) { >>> 5026 log_trace(os)("active_processor_count: " >>> 5027 "active processor count set by user : %d", >>> 5028 (int)ActiveProcessorCount); >>> 5029 return ActiveProcessorCount; >>> 5030 } >>> >>> We don't normally check flags in runtime code like this - this will be executed on every call, and you will see that logging each time. This should be handled during initialization (os::Posix::init()? - if applying this flag globally) - with logging occurring once. The above should just reduce to: >>> >>> if (ActiveProcessorCount > 0) { >>> return ActiveProcessorCount; // explicit user control of number of cpus >>> } >>> >>> Even then I do get concerned about having to always check for the least common cases before the most common one. :( >> This is not in a highly used function so it should be ok. > > I really don't like seeing the FLAG_IS_DEFAULT in there - and you need to move the logging anyway. > >>> >>> The osContainer_.hpp files seem to be unnecessary as they are all empty. >> I?ll remove them. I wasn?t sure if there was a convention to move more of osContainer_linux.cpp -> osContainer_linux.hpp. >> For example: classCgroupSubsystem > > The header is only needed to expose an API for other code to use. Locally defined classes can be kept in the .cpp file. > >>> 34 class CgroupSubsystem: CHeapObj { >>> >>> You defined this class as CHeapObj and added a destructor to free a few things, but I can't see where the instances of this class will themselves ever be freed >> What?s the latest thinking on freeing CHeap Objects on termination? Is it really worth wasting cpu cycles when our >> process is about to terminate? If not, I?ll just remove the destructors. > > Philosophically I prefer new APIs to play nice with the invocation API, even if existing API's don't play nice. But that's just me. > >>> >>> 62 void set_subsystem_path(char *cgroup_path) { >>> >>> If this takes a "const char*" will it save you from casting string literals to "char*" elsewhere? 
>> I tried several different ways of declaring the container accessor functions and >> always ended up with warnings due to scanf not being able to validate arguments >> since the format string didn?t end up being a string literal. I originally was using templates >> and then ended up with the macros. I tried several different casts but could resolve the problem. > > Sounds like something Kim Barrett should take a look at :) > > Thanks, > David From tobias.hartmann at oracle.com Mon Oct 2 16:07:40 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 2 Oct 2017 18:07:40 +0200 Subject: CFV: New hotspot Group Member: Ioi Lam In-Reply-To: References: Message-ID: <4925faea-082c-4d8b-5e48-49c5299e3f6d@oracle.com> Vote: yes On 02.10.2017 17:24, coleen.phillimore at oracle.com wrote: > I hereby nominate Ioi Lam (OpenJDK user name: iklam) to Membership in the hotspot Group. > > Ioi has been working on the hotspot project for over 5 years and is a Reviewer in the JDK 9 Project with 79 changes. > He is an expert in the area of class data sharing. > > Votes are due by Monday, October 16, 2017. > > Only current Members of the hotspot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by > replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen > > [1]http://openjdk.java.net/census#hotspot > [2]http://openjdk.java.net/groups/#member-vote From jesper.wilhelmsson at oracle.com Mon Oct 2 16:45:48 2017 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Mon, 2 Oct 2017 18:45:48 +0200 Subject: CFV: New hotspot Group Member: Ioi Lam In-Reply-To: References: Message-ID: <6B9190C0-E9C8-42FC-9F73-38FAC49C1EDF@oracle.com> Vote: yes /Jesper > On 2 Oct 2017, at 17:24, coleen.phillimore at oracle.com wrote: > > I hereby nominate Ioi Lam (OpenJDK user name: iklam) to Membership in the hotspot Group. > > Ioi has been working on the hotspot project for over 5 years and is a Reviewer in the JDK 9 Project with 79 changes. He is an expert in the area of class data sharing. > > Votes are due by Monday, October 16, 2017. > > Only current Members of the hotspot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen > > [1]http://openjdk.java.net/census#hotspot > [2]http://openjdk.java.net/groups/#member-vote From erik.osterlund at oracle.com Mon Oct 2 16:48:34 2017 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Mon, 2 Oct 2017 18:48:34 +0200 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> Message-ID: Hi Coleen, I looked a bit at the code generation part of this change. It beats me that the indirect load required for resolution of the oop handle was somewhat encapsulated in a resolve oop handle call in the macro assembler (a bit like resolve jobject), but in the corresponding C1 and C2 code, there is no such abstraction. Instead the loads required for resolve are generated straight up. Therefore, if the logic involved in resolving an OopHandle ever changes, it might start to get tricky to chase down where it is being used too. So I wonder if you would find it useful to encapsulate that into some method on e.g. LIRGenerator for C1 and GraphKit for C2? 
In the case of C2 it might be a bit tricky to abstract due to the node matching logic, unless we want to macro expand a new ResolveOopHandleNode, or something like that. Or a matching function maybe. Just a thought that beat me reading through the changes. I like abstractions! Thanks, /Erik > On 28 Sep 2017, at 23:36, coleen.phillimore at oracle.com wrote: > > > Thank you to Stefan Karlsson offlist for pointing out that the previous .01 version of this webrev breaks CMS in that it doesn't remember ClassLoaderData::_handles that are changed and added while concurrent marking is in progress. I've fixed this bug to move the Klass::_modified_oops and _accumulated_modified_oops to the ClassLoaderData and use these fields in the CMS remarking phase to catch any new handles that are added. This also fixes this bug https://bugs.openjdk.java.net/browse/JDK-8173988 . > > In addition, the previous version of this change removed an optimization during young collection, which showed some uncertain performance regression in young pause times, so I added this optimization back to not walk ClassLoaderData during young collections if all the oops are old. The performance results of SPECjbb2015 now are slightly better, but not significantly. > > This latest patch has been tested on tier1-5 on linux x64 and windows x64 in mach5 test harness. > > http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ > > Can I get at least 3 reviewers? One from each of the compiler, gc, and runtime group at least since there are changes to all 3. > > Thanks! > Coleen > > >> On 9/6/17 12:04 PM, coleen.phillimore at oracle.com wrote: >> Summary: Add indirection for fetching mirror so that GC doesn't have to follow CLD::_klasses >> >> Thank you to Tom Rodriguez for Graal changes and Rickard for the C2 changes. >> >> Ran nightly tests through Mach5 and RBT. Early performance testing showed good performance improvment in GC class loader data processing time, but nmethod processing time continues to dominate. Also performace testing showed no throughput regression. I'm rerunning both of these performance testing and will post the numbers. >> >> bug link https://bugs.openjdk.java.net/browse/JDK-8186777 >> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.01/webrev >> >> Thanks, >> Coleen From coleen.phillimore at oracle.com Mon Oct 2 17:04:19 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 2 Oct 2017 13:04:19 -0400 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> Message-ID: On 10/2/17 12:48 PM, Erik Osterlund wrote: > Hi Coleen, > > I looked a bit at the code generation part of this change. > It beats me that the indirect load required for resolution of the oop handle was somewhat encapsulated in a resolve oop handle call in the macro assembler (a bit like resolve jobject), but in the corresponding C1 and C2 code, there is no such abstraction. Instead the loads required for resolve are generated straight up. Therefore, if the logic involved in resolving an OopHandle ever changes, it might start to get tricky to chase down where it is being used too. Hi Erik,? I wanted the load encaspulated in resolve_oop_handle() in the macroAssembler, but I didn't know how to change the c1/c2 code (or graal) to do the same. > > So I wonder if you would find it useful to encapsulate that into some method on e.g. LIRGenerator for C1 and GraphKit for C2? 
> In the case of C2 it might be a bit tricky to abstract due to the node matching logic, unless we want to macro expand a new ResolveOopHandleNode, or something like that. Or a matching function maybe. Can I file a seperate RFE for this?? I like the idea very much but would like to push this larger change first. > > Just a thought that beat me reading through the changes. I like abstractions! Me too! Thanks, Coleen > > Thanks, > /Erik > >> On 28 Sep 2017, at 23:36, coleen.phillimore at oracle.com wrote: >> >> >> Thank you to Stefan Karlsson offlist for pointing out that the previous .01 version of this webrev breaks CMS in that it doesn't remember ClassLoaderData::_handles that are changed and added while concurrent marking is in progress. I've fixed this bug to move the Klass::_modified_oops and _accumulated_modified_oops to the ClassLoaderData and use these fields in the CMS remarking phase to catch any new handles that are added. This also fixes this bug https://bugs.openjdk.java.net/browse/JDK-8173988 . >> >> In addition, the previous version of this change removed an optimization during young collection, which showed some uncertain performance regression in young pause times, so I added this optimization back to not walk ClassLoaderData during young collections if all the oops are old. The performance results of SPECjbb2015 now are slightly better, but not significantly. >> >> This latest patch has been tested on tier1-5 on linux x64 and windows x64 in mach5 test harness. >> >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ >> >> Can I get at least 3 reviewers? One from each of the compiler, gc, and runtime group at least since there are changes to all 3. >> >> Thanks! >> Coleen >> >> >>> On 9/6/17 12:04 PM, coleen.phillimore at oracle.com wrote: >>> Summary: Add indirection for fetching mirror so that GC doesn't have to follow CLD::_klasses >>> >>> Thank you to Tom Rodriguez for Graal changes and Rickard for the C2 changes. >>> >>> Ran nightly tests through Mach5 and RBT. Early performance testing showed good performance improvment in GC class loader data processing time, but nmethod processing time continues to dominate. Also performace testing showed no throughput regression. I'm rerunning both of these performance testing and will post the numbers. >>> >>> bug link https://bugs.openjdk.java.net/browse/JDK-8186777 >>> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.01/webrev >>> >>> Thanks, >>> Coleen From coleen.phillimore at oracle.com Mon Oct 2 17:10:15 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 2 Oct 2017 13:10:15 -0400 Subject: CFV: New hotspot Group Member: Ioi Lam In-Reply-To: References: Message-ID: <3a248076-9b93-ab0d-0327-3c24931014e6@oracle.com> Vote: yes On 10/2/17 11:24 AM, coleen.phillimore at oracle.com wrote: > I hereby nominate Ioi Lam (OpenJDK user name: iklam) to Membership in > the hotspot Group. > > Ioi has been working on the hotspot project for over 5 years and is a > Reviewer in the JDK 9 Project with 79 changes.?? He is an expert in > the area of class data sharing. > > Votes are due by Monday, October 16, 2017. > > Only current Members of the hotspot Group [1] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. 
> > Coleen > > [1]http://openjdk.java.net/census#hotspot > [2]http://openjdk.java.net/groups/#member-vote From erik.osterlund at oracle.com Mon Oct 2 17:13:38 2017 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Mon, 2 Oct 2017 19:13:38 +0200 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> Message-ID: Hi Coleen, > On 2 Oct 2017, at 19:04, coleen.phillimore at oracle.com wrote: > > > >> On 10/2/17 12:48 PM, Erik Osterlund wrote: >> Hi Coleen, >> >> I looked a bit at the code generation part of this change. >> It beats me that the indirect load required for resolution of the oop handle was somewhat encapsulated in a resolve oop handle call in the macro assembler (a bit like resolve jobject), but in the corresponding C1 and C2 code, there is no such abstraction. Instead the loads required for resolve are generated straight up. Therefore, if the logic involved in resolving an OopHandle ever changes, it might start to get tricky to chase down where it is being used too. > > Hi Erik, I wanted the load encaspulated in resolve_oop_handle() in the macroAssembler, but I didn't know how to change the c1/c2 code (or graal) to do the same. >> >> So I wonder if you would find it useful to encapsulate that into some method on e.g. LIRGenerator for C1 and GraphKit for C2? >> In the case of C2 it might be a bit tricky to abstract due to the node matching logic, unless we want to macro expand a new ResolveOopHandleNode, or something like that. Or a matching function maybe. > > Can I file a seperate RFE for this? I like the idea very much but would like to push this larger change first. Sure, I am fine with that. >> >> Just a thought that beat me reading through the changes. I like abstractions! > > Me too! :) Thanks, /Erik > Thanks, > Coleen >> >> Thanks, >> /Erik >> >>> On 28 Sep 2017, at 23:36, coleen.phillimore at oracle.com wrote: >>> >>> >>> Thank you to Stefan Karlsson offlist for pointing out that the previous .01 version of this webrev breaks CMS in that it doesn't remember ClassLoaderData::_handles that are changed and added while concurrent marking is in progress. I've fixed this bug to move the Klass::_modified_oops and _accumulated_modified_oops to the ClassLoaderData and use these fields in the CMS remarking phase to catch any new handles that are added. This also fixes this bug https://bugs.openjdk.java.net/browse/JDK-8173988 . >>> >>> In addition, the previous version of this change removed an optimization during young collection, which showed some uncertain performance regression in young pause times, so I added this optimization back to not walk ClassLoaderData during young collections if all the oops are old. The performance results of SPECjbb2015 now are slightly better, but not significantly. >>> >>> This latest patch has been tested on tier1-5 on linux x64 and windows x64 in mach5 test harness. >>> >>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ >>> >>> Can I get at least 3 reviewers? One from each of the compiler, gc, and runtime group at least since there are changes to all 3. >>> >>> Thanks! >>> Coleen >>> >>> >>>> On 9/6/17 12:04 PM, coleen.phillimore at oracle.com wrote: >>>> Summary: Add indirection for fetching mirror so that GC doesn't have to follow CLD::_klasses >>>> >>>> Thank you to Tom Rodriguez for Graal changes and Rickard for the C2 changes. >>>> >>>> Ran nightly tests through Mach5 and RBT. 
Early performance testing showed good performance improvment in GC class loader data processing time, but nmethod processing time continues to dominate. Also performace testing showed no throughput regression. I'm rerunning both of these performance testing and will post the numbers. >>>> >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8186777 >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.01/webrev >>>> >>>> Thanks, >>>> Coleen > From vladimir.kozlov at oracle.com Mon Oct 2 18:29:12 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 2 Oct 2017 11:29:12 -0700 Subject: CFV: New hotspot Group Member: Ioi Lam In-Reply-To: <3a248076-9b93-ab0d-0327-3c24931014e6@oracle.com> References: <3a248076-9b93-ab0d-0327-3c24931014e6@oracle.com> Message-ID: Vote: yes Vladimir > On Oct 2, 2017, at 10:10 AM, coleen.phillimore at oracle.com wrote: > > Vote: yes > >> On 10/2/17 11:24 AM, coleen.phillimore at oracle.com wrote: >> I hereby nominate Ioi Lam (OpenJDK user name: iklam) to Membership in the hotspot Group. >> >> Ioi has been working on the hotspot project for over 5 years and is a Reviewer in the JDK 9 Project with 79 changes. He is an expert in the area of class data sharing. >> >> Votes are due by Monday, October 16, 2017. >> >> Only current Members of the hotspot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. >> >> For Lazy Consensus voting instructions, see [2]. >> >> Coleen >> >> [1]http://openjdk.java.net/census#hotspot >> [2]http://openjdk.java.net/groups/#member-vote > From robbin.ehn at oracle.com Mon Oct 2 20:20:58 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 2 Oct 2017 22:20:58 +0200 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> Message-ID: <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> Hi Bob, As I said in your presentation for RT. If kernel if configured with cgroup this should always be read (otherwise we get wrong values). E.g. fedora have had cgroups default on several years (I believe most distros have it on). - No option is needed at all: right now we have wrong values your fix will provide right ones, why would you ever what to turn that off? - log target container would make little sense since almost all linuxes run with croups on. - For cpuset, the processes affinity mask already reflect cgroup setting so you don't need to look into cgroup for that If you do, you would miss any processes specific affinity mask. So _cpu_count() should already be returning the right number of CPU's. Thanks for trying to fixing this! /Robbin On 09/22/2017 04:27 PM, Bob Vandette wrote: > Please review these changes that improve on docker container detection and the > automatic configuration of the number of active CPUs and total and free memory > based on the containers resource limitation settings and metric data files. > > http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ > > These changes are enabled with -XX:+UseContainerSupport. > > You can enable logging for this support via -Xlog:os+container=trace. > > Since the dynamic selection of CPUs based on cpusets, quotas and shares > may not satisfy every users needs, I?ve added an additional flag to allow the > number of CPUs to be overridden. This flag is named -XX:ActiveProcessorCount=xx. > > > Bob. 
> > > From david.holmes at oracle.com Mon Oct 2 22:05:16 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Oct 2017 08:05:16 +1000 Subject: CFV: New hotspot Group Member: Ioi Lam In-Reply-To: References: Message-ID: <5c43ea90-b50d-bfa9-1584-a5820b7040f2@oracle.com> Vote: yes David On 3/10/2017 1:24 AM, coleen.phillimore at oracle.com wrote: > I hereby nominate Ioi Lam (OpenJDK user name: iklam) to Membership in > the hotspot Group. > > Ioi has been working on the hotspot project for over 5 years and is a > Reviewer in the JDK 9 Project with 79 changes.?? He is an expert in the > area of class data sharing. > > Votes are due by Monday, October 16, 2017. > > Only current Members of the hotspot Group [1] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen > > [1]http://openjdk.java.net/census#hotspot > [2]http://openjdk.java.net/groups/#member-vote From david.holmes at oracle.com Mon Oct 2 22:46:20 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Oct 2017 08:46:20 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> Message-ID: <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> Hi Robbin, I have some views on this :) On 3/10/2017 6:20 AM, Robbin Ehn wrote: > Hi Bob, > > As I said in your presentation for RT. > If kernel if configured with cgroup this should always be read > (otherwise we get wrong values). > E.g. fedora have had cgroups default on several years (I believe most > distros have it on). > > - No option is needed at all: right now we have wrong values your fix > will provide right ones, why would you ever what to turn that off? It's not that you would want to turn that off (necessarily) but just because cgroups capability exists it doesn't mean they have actually been enabled and configured - in which case reading all the cgroup info is unnecessary startup overhead. So for now this is opt-in - as was the experimental cgroup support we added. Once it becomes clearer how this needs to be used we can adjust the defaults. For now this is enabling technology only. > - log target container would make little sense since almost all linuxes > run with croups on. Again the capability is present but may not be enabled/configured. > - For cpuset, the processes affinity mask already reflect cgroup setting > so you don't need to look into cgroup for that > ? If you do, you would miss any processes specific affinity mask. So > _cpu_count() should already be returning the right number of CPU's. While the process affinity mask reflect cpusets (and we already use it for that reason), it doesn't reflect shares and quotas. And if shares/quotas are enforced and someone sets a custom affinity mask, what is it all supposed to mean? That's one of the main reasons to allow the number of cpu's to be hardwired via a flag. So it's better IMHO to read everything from the cgroups if configured to use cgroups. Cheers, David > > Thanks for trying to fixing this! 
> > /Robbin > > On 09/22/2017 04:27 PM, Bob Vandette wrote: >> Please review these changes that improve on docker container detection >> and the >> automatic configuration of the number of active CPUs and total and >> free memory >> based on the containers resource limitation settings and metric data >> files. >> >> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ >> >> >> These changes are enabled with -XX:+UseContainerSupport. >> >> You can enable logging for this support via -Xlog:os+container=trace. >> >> Since the dynamic selection of CPUs based on cpusets, quotas and shares >> may not satisfy every users needs, I?ve added an additional flag to >> allow the >> number of CPUs to be overridden.? This flag is named >> -XX:ActiveProcessorCount=xx. >> >> >> Bob. >> >> >> From david.holmes at oracle.com Tue Oct 3 01:33:55 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Oct 2017 11:33:55 +1000 Subject: (XS) RFR: 8188246: Add test/hotspot/jtreg/gc/logging/TestPrintReferences.java to ProblemList.txt Message-ID: <86cbbd16-f353-5214-7a4c-aca3e74afbab@oracle.com> The test fails intermittently in tier1 testing so we need to exclude it until fixed. patch inline below. webrev: http://cr.openjdk.java.net/~dholmes/8188246/webrev/ Will push under trivial rules as soon as I have one Review. Thanks, David --- old/test/hotspot/jtreg/ProblemList.txt 2017-10-02 21:26:20.127717945 -0400 +++ new/test/hotspot/jtreg/ProblemList.txt 2017-10-02 21:26:18.043599357 -0400 @@ -64,6 +64,7 @@ gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java 8177765 generic-all +gc/logging/TestPrintReferences.java 8188245 generic-all ############################################################################# From daniel.daugherty at oracle.com Tue Oct 3 01:56:54 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 2 Oct 2017 19:56:54 -0600 Subject: (XS) RFR: 8188246: Add test/hotspot/jtreg/gc/logging/TestPrintReferences.java to ProblemList.txt In-Reply-To: <86cbbd16-f353-5214-7a4c-aca3e74afbab@oracle.com> References: <86cbbd16-f353-5214-7a4c-aca3e74afbab@oracle.com> Message-ID: On 10/2/17 7:33 PM, David Holmes wrote: > The test fails intermittently in tier1 testing so we need to exclude > it until fixed. patch inline below. > > webrev: http://cr.openjdk.java.net/~dholmes/8188246/webrev/ Thumbs up! Dan > > Will push under trivial rules as soon as I have one Review. > > Thanks, > David > > --- old/test/hotspot/jtreg/ProblemList.txt??? 2017-10-02 > 21:26:20.127717945 -0400 > +++ new/test/hotspot/jtreg/ProblemList.txt??? 2017-10-02 > 21:26:18.043599357 -0400 > @@ -64,6 +64,7 @@ > ?gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all > ?gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all > > gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java > 8177765 generic-all > +gc/logging/TestPrintReferences.java 8188245 generic-all > > > ############################################################################# > From serguei.spitsyn at oracle.com Tue Oct 3 01:58:42 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 2 Oct 2017 18:58:42 -0700 Subject: (XS) RFR: 8188246: Add test/hotspot/jtreg/gc/logging/TestPrintReferences.java to ProblemList.txt In-Reply-To: References: <86cbbd16-f353-5214-7a4c-aca3e74afbab@oracle.com> Message-ID: +1 Thanks, Serguei On 10/2/17 18:56, Daniel D. 
Daugherty wrote: > On 10/2/17 7:33 PM, David Holmes wrote: >> The test fails intermittently in tier1 testing so we need to exclude >> it until fixed. patch inline below. >> >> webrev: http://cr.openjdk.java.net/~dholmes/8188246/webrev/ > > Thumbs up! > > Dan > > >> >> Will push under trivial rules as soon as I have one Review. >> >> Thanks, >> David >> >> --- old/test/hotspot/jtreg/ProblemList.txt??? 2017-10-02 >> 21:26:20.127717945 -0400 >> +++ new/test/hotspot/jtreg/ProblemList.txt??? 2017-10-02 >> 21:26:18.043599357 -0400 >> @@ -64,6 +64,7 @@ >> ?gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all >> ?gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all >> >> gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java >> 8177765 generic-all >> +gc/logging/TestPrintReferences.java 8188245 generic-all >> >> >> ############################################################################# >> > From david.holmes at oracle.com Tue Oct 3 02:00:04 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Oct 2017 12:00:04 +1000 Subject: (XS) RFR: 8188246: Add test/hotspot/jtreg/gc/logging/TestPrintReferences.java to ProblemList.txt In-Reply-To: References: <86cbbd16-f353-5214-7a4c-aca3e74afbab@oracle.com> Message-ID: <5d9bde33-4e41-cbfe-a2c2-9f024610f222@oracle.com> Thanks Dan and Serguei! Sorry I already committed before Serguei's email came through. David On 3/10/2017 11:58 AM, serguei.spitsyn at oracle.com wrote: > +1 > > Thanks, > Serguei > > On 10/2/17 18:56, Daniel D. Daugherty wrote: >> On 10/2/17 7:33 PM, David Holmes wrote: >>> The test fails intermittently in tier1 testing so we need to exclude >>> it until fixed. patch inline below. >>> >>> webrev: http://cr.openjdk.java.net/~dholmes/8188246/webrev/ >> >> Thumbs up! >> >> Dan >> >> >>> >>> Will push under trivial rules as soon as I have one Review. >>> >>> Thanks, >>> David >>> >>> --- old/test/hotspot/jtreg/ProblemList.txt??? 2017-10-02 >>> 21:26:20.127717945 -0400 >>> +++ new/test/hotspot/jtreg/ProblemList.txt??? 2017-10-02 >>> 21:26:18.043599357 -0400 >>> @@ -64,6 +64,7 @@ >>> ?gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all >>> ?gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all >>> >>> gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java >>> 8177765 generic-all >>> +gc/logging/TestPrintReferences.java 8188245 generic-all >>> >>> >>> ############################################################################# >>> >> > From serguei.spitsyn at oracle.com Tue Oct 3 02:10:55 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 2 Oct 2017 19:10:55 -0700 Subject: (XS) RFR: 8188246: Add test/hotspot/jtreg/gc/logging/TestPrintReferences.java to ProblemList.txt In-Reply-To: <5d9bde33-4e41-cbfe-a2c2-9f024610f222@oracle.com> References: <86cbbd16-f353-5214-7a4c-aca3e74afbab@oracle.com> <5d9bde33-4e41-cbfe-a2c2-9f024610f222@oracle.com> Message-ID: <995cf43d-b3f7-b89f-ecfd-977591c2905a@oracle.com> On 10/2/17 19:00, David Holmes wrote: > Thanks Dan and Serguei! > > Sorry I already committed before Serguei's email came through. No problem. :) Thanks, Serguei > > David > > On 3/10/2017 11:58 AM, serguei.spitsyn at oracle.com wrote: >> +1 >> >> Thanks, >> Serguei >> >> On 10/2/17 18:56, Daniel D. Daugherty wrote: >>> On 10/2/17 7:33 PM, David Holmes wrote: >>>> The test fails intermittently in tier1 testing so we need to >>>> exclude it until fixed. patch inline below. 
>>>> >>>> webrev: http://cr.openjdk.java.net/~dholmes/8188246/webrev/ >>> >>> Thumbs up! >>> >>> Dan >>> >>> >>>> >>>> Will push under trivial rules as soon as I have one Review. >>>> >>>> Thanks, >>>> David >>>> >>>> --- old/test/hotspot/jtreg/ProblemList.txt??? 2017-10-02 >>>> 21:26:20.127717945 -0400 >>>> +++ new/test/hotspot/jtreg/ProblemList.txt??? 2017-10-02 >>>> 21:26:18.043599357 -0400 >>>> @@ -64,6 +64,7 @@ >>>> ?gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all >>>> ?gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all >>>> >>>> gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java >>>> 8177765 generic-all >>>> +gc/logging/TestPrintReferences.java 8188245 generic-all >>>> >>>> >>>> ############################################################################# >>>> >>> >> From john.r.rose at oracle.com Tue Oct 3 06:54:24 2017 From: john.r.rose at oracle.com (John Rose) Date: Mon, 2 Oct 2017 23:54:24 -0700 Subject: CFV: New hotspot Group Member: Ioi Lam In-Reply-To: References: Message-ID: <83B6E0A3-7730-431D-B0A2-AF36A6E7C5CA@oracle.com> Vote: yes From robbin.ehn at oracle.com Tue Oct 3 08:00:31 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 3 Oct 2017 10:00:31 +0200 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> Message-ID: <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> Hi David, On 10/03/2017 12:46 AM, David Holmes wrote: > Hi Robbin, > > I have some views on this :) > > On 3/10/2017 6:20 AM, Robbin Ehn wrote: >> Hi Bob, >> >> As I said in your presentation for RT. >> If kernel if configured with cgroup this should always be read (otherwise we get wrong values). >> E.g. fedora have had cgroups default on several years (I believe most distros have it on). >> >> - No option is needed at all: right now we have wrong values your fix will provide right ones, why would you ever what to turn that off? > > It's not that you would want to turn that off (necessarily) but just because cgroups capability exists it doesn't mean they have actually been enabled and configured - in > which case reading all the cgroup info is unnecessary startup overhead. So for now this is opt-in - as was the experimental cgroup support we added. Once it becomes clearer > how this needs to be used we can adjust the defaults. For now this is enabling technology only. If cgroup are mounted they are on and the only way to know the configuration (such as no limits) is to actual read the cgroup filesystem. Therefore the flag make no sense. > >> - log target container would make little sense since almost all linuxes run with croups on. > > Again the capability is present but may not be enabled/configured. The capability is on if cgroup are mount and the only way to know the configuration is to read the cgroup filesystem. > >> - For cpuset, the processes affinity mask already reflect cgroup setting so you don't need to look into cgroup for that >> ?? If you do, you would miss any processes specific affinity mask. So _cpu_count() should already be returning the right number of CPU's. > > While the process affinity mask reflect cpusets (and we already use it for that reason), it doesn't reflect shares and quotas. 
And if shares/quotas are enforced and someone > sets a custom affinity mask, what is it all supposed to mean? That's one of the main reasons to allow the number of cpu's to be hardwired via a flag. So it's better IMHO to > read everything from the cgroups if configured to use cgroups. I'm not taking about shares and quotes, they should be read of course, but cpuset should be checked such as in _cpu_count. Here is the bug: [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp . ForEver | grep proc [0.002s][debug][os] Initial active processor count set to 4 ^C [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -XX:+UseContainerSupport -Xlog:os=debug -cp . ForEver | grep proc [0.003s][debug][os] Initial active processor count set to 32 ^C _cpu_count already does the right thing. Thanks, Robbin > > Cheers, > David > >> >> Thanks for trying to fixing this! >> >> /Robbin >> >> On 09/22/2017 04:27 PM, Bob Vandette wrote: >>> Please review these changes that improve on docker container detection and the >>> automatic configuration of the number of active CPUs and total and free memory >>> based on the containers resource limitation settings and metric data files. >>> >>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ >>> >>> These changes are enabled with -XX:+UseContainerSupport. >>> >>> You can enable logging for this support via -Xlog:os+container=trace. >>> >>> Since the dynamic selection of CPUs based on cpusets, quotas and shares >>> may not satisfy every users needs, I?ve added an additional flag to allow the >>> number of CPUs to be overridden.? This flag is named -XX:ActiveProcessorCount=xx. >>> >>> >>> Bob. >>> >>> >>> From david.holmes at oracle.com Tue Oct 3 08:42:43 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Oct 2017 18:42:43 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> Message-ID: <640fdf30-fc85-112f-ad11-b99cc071053e@oracle.com> On 3/10/2017 6:00 PM, Robbin Ehn wrote: > Hi David, > > On 10/03/2017 12:46 AM, David Holmes wrote: >> Hi Robbin, >> >> I have some views on this :) >> >> On 3/10/2017 6:20 AM, Robbin Ehn wrote: >>> Hi Bob, >>> >>> As I said in your presentation for RT. >>> If kernel if configured with cgroup this should always be read >>> (otherwise we get wrong values). >>> E.g. fedora have had cgroups default on several years (I believe most >>> distros have it on). >>> >>> - No option is needed at all: right now we have wrong values your fix >>> will provide right ones, why would you ever what to turn that off? >> >> It's not that you would want to turn that off (necessarily) but just >> because cgroups capability exists it doesn't mean they have actually >> been enabled and configured - in which case reading all the cgroup >> info is unnecessary startup overhead. So for now this is opt-in - as >> was the experimental cgroup support we added. Once it becomes clearer >> how this needs to be used we can adjust the defaults. For now this is >> enabling technology only. > > If cgroup are mounted they are on and the only way to know the > configuration (such as no limits) is to actual read the cgroup filesystem. > Therefore the flag make no sense. No that is exactly why it is opt-in! 
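For reference, the _cpu_count() behaviour Robbin demonstrates with taskset above comes down to asking the kernel for the process affinity mask, which already reflects any cpuset/taskset placement (but not cgroup quota or shares). A minimal standalone sketch of that query - illustration only, not the HotSpot code:

  // cpu_count.cpp - standalone illustration only, not HotSpot code.
  // Counts the CPUs this process may run on via its affinity mask,
  // which already reflects taskset/cpuset placement (but not cgroup
  // cpu quota or shares).  Build: g++ cpu_count.cpp -o cpu_count
  #include <sched.h>   // sched_getaffinity, CPU_ZERO, CPU_COUNT (g++ defines _GNU_SOURCE)
  #include <cstdio>

  int main() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
      perror("sched_getaffinity");
      return 1;
    }
    // e.g. prints 4 when run under "taskset --cpu-list 0-2,6"
    printf("permitted CPUs: %d\n", CPU_COUNT(&mask));
    return 0;
  }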
Why should we have to waste startup time reading a bunch of cgroup values just to determine that cgroups are not actually being used! >> >>> - log target container would make little sense since almost all >>> linuxes run with croups on. >> >> Again the capability is present but may not be enabled/configured. > > The capability is on if cgroup are mount and the only way to know the > configuration is to read the cgroup filesystem. > >> >>> - For cpuset, the processes affinity mask already reflect cgroup >>> setting so you don't need to look into cgroup for that >>> ?? If you do, you would miss any processes specific affinity mask. So >>> _cpu_count() should already be returning the right number of CPU's. >> >> While the process affinity mask reflect cpusets (and we already use it >> for that reason), it doesn't reflect shares and quotas. And if >> shares/quotas are enforced and someone sets a custom affinity mask, >> what is it all supposed to mean? That's one of the main reasons to >> allow the number of cpu's to be hardwired via a flag. So it's better >> IMHO to read everything from the cgroups if configured to use cgroups. > > I'm not taking about shares and quotes, they should be read of course, > but cpuset should be checked such as in _cpu_count. > > Here is the bug: > > [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp . > ForEver | grep proc > [0.002s][debug][os] Initial active processor count set to 4 > ^C > [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java > -XX:+UseContainerSupport -Xlog:os=debug -cp . ForEver | grep proc > [0.003s][debug][os] Initial active processor count set to 32 > ^C > > _cpu_count already does the right thing. But how do you then combine that information with the use of shares and/or quotas? David ----- > Thanks, Robbin > > >> >> Cheers, >> David >> >>> >>> Thanks for trying to fixing this! >>> >>> /Robbin >>> >>> On 09/22/2017 04:27 PM, Bob Vandette wrote: >>>> Please review these changes that improve on docker container >>>> detection and the >>>> automatic configuration of the number of active CPUs and total and >>>> free memory >>>> based on the containers resource limitation settings and metric data >>>> files. >>>> >>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ >>>> >>>> >>>> These changes are enabled with -XX:+UseContainerSupport. >>>> >>>> You can enable logging for this support via -Xlog:os+container=trace. >>>> >>>> Since the dynamic selection of CPUs based on cpusets, quotas and shares >>>> may not satisfy every users needs, I?ve added an additional flag to >>>> allow the >>>> number of CPUs to be overridden.? This flag is named >>>> -XX:ActiveProcessorCount=xx. >>>> >>>> >>>> Bob. >>>> >>>> >>>> From robbin.ehn at oracle.com Tue Oct 3 10:45:10 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 3 Oct 2017 12:45:10 +0200 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <640fdf30-fc85-112f-ad11-b99cc071053e@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <640fdf30-fc85-112f-ad11-b99cc071053e@oracle.com> Message-ID: <2df87576-cd2f-6d1d-4367-8a2956b88fea@oracle.com> Hi David, I think we are seen the issue from complete opposite. 
(this RFE could be pushed as a bug from my POV) On 10/03/2017 10:42 AM, David Holmes wrote: > On 3/10/2017 6:00 PM, Robbin Ehn wrote: >> Hi David, >> >> On 10/03/2017 12:46 AM, David Holmes wrote: >>> Hi Robbin, >>> >>> I have some views on this :) >>> >>> On 3/10/2017 6:20 AM, Robbin Ehn wrote: >>>> Hi Bob, >>>> >>>> As I said in your presentation for RT. >>>> If kernel if configured with cgroup this should always be read (otherwise we get wrong values). >>>> E.g. fedora have had cgroups default on several years (I believe most distros have it on). >>>> >>>> - No option is needed at all: right now we have wrong values your fix will provide right ones, why would you ever what to turn that off? >>> >>> It's not that you would want to turn that off (necessarily) but just because cgroups capability exists it doesn't mean they have actually been enabled and configured - >>> in which case reading all the cgroup info is unnecessary startup overhead. So for now this is opt-in - as was the experimental cgroup support we added. Once it becomes >>> clearer how this needs to be used we can adjust the defaults. For now this is enabling technology only. >> >> If cgroup are mounted they are on and the only way to know the configuration (such as no limits) is to actual read the cgroup filesystem. >> Therefore the flag make no sense. > > No that is exactly why it is opt-in! Why should we have to waste startup time reading a bunch of cgroup values just to determine that cgroups are not actually being used! If you have a cgroup enabled kernel they _are_ being used, no escaping that. cgroup is not a simple yes and no so for which resources depend on how you configured your kernel. To find out for what resource and what limits are set is we need to read them. I rather waste startup time (0.103292989 vs 0.103577139 seconds) and get values correct, so our heuristic works fine out-of-the-box. (and if you must, it opt-out) Also I notice that we don't read the numa values so the phys mem method does a poor job. Correct would be check at least cgroup and numa bindings. We also have this option UseCGroupMemoryLimitForHeap which should be removed. > >>> >>>> - log target container would make little sense since almost all linuxes run with croups on. >>> >>> Again the capability is present but may not be enabled/configured. >> >> The capability is on if cgroup are mount and the only way to know the configuration is to read the cgroup filesystem. >> >>> >>>> - For cpuset, the processes affinity mask already reflect cgroup setting so you don't need to look into cgroup for that >>>> ?? If you do, you would miss any processes specific affinity mask. So _cpu_count() should already be returning the right number of CPU's. >>> >>> While the process affinity mask reflect cpusets (and we already use it for that reason), it doesn't reflect shares and quotas. And if shares/quotas are enforced and >>> someone sets a custom affinity mask, what is it all supposed to mean? That's one of the main reasons to allow the number of cpu's to be hardwired via a flag. So it's >>> better IMHO to read everything from the cgroups if configured to use cgroups. >> >> I'm not taking about shares and quotes, they should be read of course, but cpuset should be checked such as in _cpu_count. >> >> Here is the bug: >> >> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp . 
ForEver | grep proc >> [0.002s][debug][os] Initial active processor count set to 4 >> ^C >> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -XX:+UseContainerSupport -Xlog:os=debug -cp . ForEver | grep proc >> [0.003s][debug][os] Initial active processor count set to 32 >> ^C >> >> _cpu_count already does the right thing. > > But how do you then combine that information with the use of shares and/or quotas? That I don't know, wild naive guess would be: active count ~ MIN(OSContainer::pd_active_processor_count(), cpuset); :) I assume everything we need to know is in: https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt Thanks, Robbin > > David > ----- > >> Thanks, Robbin >> >> >>> >>> Cheers, >>> David >>> >>>> >>>> Thanks for trying to fixing this! >>>> >>>> /Robbin >>>> >>>> On 09/22/2017 04:27 PM, Bob Vandette wrote: >>>>> Please review these changes that improve on docker container detection and the >>>>> automatic configuration of the number of active CPUs and total and free memory >>>>> based on the containers resource limitation settings and metric data files. >>>>> >>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ >>>>> >>>>> These changes are enabled with -XX:+UseContainerSupport. >>>>> >>>>> You can enable logging for this support via -Xlog:os+container=trace. >>>>> >>>>> Since the dynamic selection of CPUs based on cpusets, quotas and shares >>>>> may not satisfy every users needs, I?ve added an additional flag to allow the >>>>> number of CPUs to be overridden.? This flag is named -XX:ActiveProcessorCount=xx. >>>>> >>>>> >>>>> Bob. >>>>> >>>>> >>>>> From david.holmes at oracle.com Tue Oct 3 11:00:46 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Oct 2017 21:00:46 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <2df87576-cd2f-6d1d-4367-8a2956b88fea@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <640fdf30-fc85-112f-ad11-b99cc071053e@oracle.com> <2df87576-cd2f-6d1d-4367-8a2956b88fea@oracle.com> Message-ID: Hi Robbin, On 3/10/2017 8:45 PM, Robbin Ehn wrote: > Hi David, I think we are seen the issue from complete opposite. (this > RFE could be pushed as a bug from my POV) Yes we see this completely opposite. I see this is a poorly integrated add-on API that we have to try to account for instead of being able to read an "always correct" value from a standard OS API. They at least got the cpuset support correct by having sched_getaffinity correctly account for it. Alas the rest is ad-hoc. > > On 10/03/2017 10:42 AM, David Holmes wrote: >> On 3/10/2017 6:00 PM, Robbin Ehn wrote: >>> Hi David, >>> >>> On 10/03/2017 12:46 AM, David Holmes wrote: >>>> Hi Robbin, >>>> >>>> I have some views on this :) >>>> >>>> On 3/10/2017 6:20 AM, Robbin Ehn wrote: >>>>> Hi Bob, >>>>> >>>>> As I said in your presentation for RT. >>>>> If kernel if configured with cgroup this should always be read >>>>> (otherwise we get wrong values). >>>>> E.g. fedora have had cgroups default on several years (I believe >>>>> most distros have it on). >>>>> >>>>> - No option is needed at all: right now we have wrong values your >>>>> fix will provide right ones, why would you ever what to turn that off? 
>>>> >>>> It's not that you would want to turn that off (necessarily) but just >>>> because cgroups capability exists it doesn't mean they have actually >>>> been enabled and configured - in which case reading all the cgroup >>>> info is unnecessary startup overhead. So for now this is opt-in - as >>>> was the experimental cgroup support we added. Once it becomes >>>> clearer how this needs to be used we can adjust the defaults. For >>>> now this is enabling technology only. >>> >>> If cgroup are mounted they are on and the only way to know the >>> configuration (such as no limits) is to actual read the cgroup >>> filesystem. >>> Therefore the flag make no sense. >> >> No that is exactly why it is opt-in! Why should we have to waste >> startup time reading a bunch of cgroup values just to determine that >> cgroups are not actually being used! > > If you have a cgroup enabled kernel they _are_ being used, no escaping > that. A cgroup set to unlimited is not being used from a practical perspective. > cgroup is not a simple yes and no so for which resources depend on how > you configured your kernel. > To find out for what resource and what limits are set is we need to read > them. > > I rather waste startup time (0.103292989 vs 0.103577139 seconds) and get > values correct, so our heuristic works fine out-of-the-box. (and if you > must, it opt-out) I'd rather people say "Hey I'm using this add-on resource management API so don't ask the OS but please query the add-on.". Yes that is a little harsh but the lack of integration at the OS level is a huge impediment in my opinion. > Also I notice that we don't read the numa values so the phys mem method > does a poor job. Correct would be check at least cgroup and numa bindings. NUMA is another minefield. > We also have this option UseCGroupMemoryLimitForHeap which should be > removed. Bob already addressed why he was not getting rid of that initially. >> >>>> >>>>> - log target container would make little sense since almost all >>>>> linuxes run with croups on. >>>> >>>> Again the capability is present but may not be enabled/configured. >>> >>> The capability is on if cgroup are mount and the only way to know the >>> configuration is to read the cgroup filesystem. >>> >>>> >>>>> - For cpuset, the processes affinity mask already reflect cgroup >>>>> setting so you don't need to look into cgroup for that >>>>> ?? If you do, you would miss any processes specific affinity mask. >>>>> So _cpu_count() should already be returning the right number of CPU's. >>>> >>>> While the process affinity mask reflect cpusets (and we already use >>>> it for that reason), it doesn't reflect shares and quotas. And if >>>> shares/quotas are enforced and someone sets a custom affinity mask, >>>> what is it all supposed to mean? That's one of the main reasons to >>>> allow the number of cpu's to be hardwired via a flag. So it's better >>>> IMHO to read everything from the cgroups if configured to use cgroups. >>> >>> I'm not taking about shares and quotes, they should be read of >>> course, but cpuset should be checked such as in _cpu_count. >>> >>> Here is the bug: >>> >>> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp >>> . ForEver | grep proc >>> [0.002s][debug][os] Initial active processor count set to 4 >>> ^C >>> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java >>> -XX:+UseContainerSupport -Xlog:os=debug -cp . 
ForEver | grep proc >>> [0.003s][debug][os] Initial active processor count set to 32 >>> ^C >>> >>> _cpu_count already does the right thing. >> >> But how do you then combine that information with the use of shares >> and/or quotas? > > That I don't know, wild naive guess would be: > active count ~ MIN(OSContainer::pd_active_processor_count(), cpuset); :) That would be one option but it may not be meaningful. That said I don't think the use of quota or shares to define the number of available CPUs makes sense anyway. Personally I don't think mixing direct use of cpusets with cgroup defined limits makes much sense. > I assume everything we need to know is in: > https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt Nope that only addresses cpusets. The one part of this that at least makes sense in isolation. Cheers, David > Thanks, Robbin > >> >> David >> ----- >> >>> Thanks, Robbin >>> >>> >>>> >>>> Cheers, >>>> David >>>> >>>>> >>>>> Thanks for trying to fixing this! >>>>> >>>>> /Robbin >>>>> >>>>> On 09/22/2017 04:27 PM, Bob Vandette wrote: >>>>>> Please review these changes that improve on docker container >>>>>> detection and the >>>>>> automatic configuration of the number of active CPUs and total and >>>>>> free memory >>>>>> based on the containers resource limitation settings and metric >>>>>> data files. >>>>>> >>>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ >>>>>> >>>>>> >>>>>> These changes are enabled with -XX:+UseContainerSupport. >>>>>> >>>>>> You can enable logging for this support via -Xlog:os+container=trace. >>>>>> >>>>>> Since the dynamic selection of CPUs based on cpusets, quotas and >>>>>> shares >>>>>> may not satisfy every users needs, I?ve added an additional flag >>>>>> to allow the >>>>>> number of CPUs to be overridden.? This flag is named >>>>>> -XX:ActiveProcessorCount=xx. >>>>>> >>>>>> >>>>>> Bob. >>>>>> >>>>>> >>>>>> From vladimir.x.ivanov at oracle.com Tue Oct 3 11:54:12 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 3 Oct 2017 14:54:12 +0300 Subject: Questions about ... Lambda Form Compilation In-Reply-To: References: Message-ID: <57d5cf51-111f-d34a-e161-02df724b6577@oracle.com> Hi, > 2. For the same cluster, we also see over half of machines repeatedly > experiencing full GC due to Metaspace full. We dump JSTACK for every minute > during 30 minutes, and see many threads are trying to compile the exact > same lambda form throughout the 30-minute period. > > Here is an example stacktrace on one machine. The LambdaForm triggers the > compilation on that machine is always LambdaForm$MH/170067652. Once it's > compiled, it should use the new compiled lambda form. We don't know why > it's still trying to compile the same lambda form again and again. -- Would > it be because the compiled lambda form somehow failed to load? This might > relate to the negative number of loaded classes. What you are seeing here is LambdaForm customization (8069591 [1]). Customization creates a new LambdaForm instance specialized for a particular MethodHandle instance (no LF sharing possible). It was designed to alleviate performance penalty when inlining through a MH invoker doesn't happen and enables JIT-compilers to compile the whole method handle chain into a single nmethod. Without customization a method handle chain breaks up into a chain of small nmethods (1 nmethod per LambdaForm) and calls between them start dominate the execution time. (More details are available in [2].) 
Customization takes place once a method handle has been invoked through MH.invoke/invokeExact() more than 127 times. Considering you observe continuous customization, it means there are method handles being continuously instantiated and used which share the same lambda form (LambdaForm$MH/170067652). It leads to excessive generation of VM anonymous classes and creates memory pressure in Metaspace. As a workaround, you can try to disable LF customization (java.lang.invoke.MethodHandle.CUSTOMIZE_THRESHOLD=-1). But I'd suggest to look into why the application continuously creates method handles. As you noted, it doesn't play well with existing heuristics aimed at maximum throughput which assume the application behavior "stabilizes" over time. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8069591 [2] http://cr.openjdk.java.net/~vlivanov/talks/2015-JVMLS_State_of_JLI.pdf slides #45-#50 > "20170926_232912_39740_3vuuu.1.79-4-76640" #76640 prio=5 os_prio=0 > tid=0x00007f908006dbd0 nid=0x150a6 runnable [0x00007f8bddb1b000] > java.lang.Thread.State: RUNNABLE > at sun.misc.Unsafe.defineAnonymousClass(Native Method) > at java.lang.invoke.InvokerBytecodeGenerator. > loadAndInitializeInvokerClass(InvokerBytecodeGenerator.java:284) > at java.lang.invoke.InvokerBytecodeGenerator.loadMethod( > InvokerBytecodeGenerator.java:276) > at java.lang.invoke.InvokerBytecodeGenerator. > generateCustomizedCode(InvokerBytecodeGenerator.java:618) > at java.lang.invoke.LambdaForm.compileToBytecode(LambdaForm. > java:654) > at java.lang.invoke.LambdaForm.prepare(LambdaForm.java:635) > at java.lang.invoke.MethodHandle.updateForm(MethodHandle.java: > 1432) > at java.lang.invoke.MethodHandle.customize(MethodHandle.java: > 1442) > at java.lang.invoke.Invokers.maybeCustomize(Invokers.java:407) > at java.lang.invoke.Invokers.checkCustomized(Invokers.java:398) > at java.lang.invoke.LambdaForm$MH/170067652.invokeExact_MT( > LambdaForm$MH) > at com.facebook.presto.operator.aggregation.MinMaxHelper. > combineStateWithState(MinMaxHelper.java:141) > at com.facebook.presto.operator.aggregation. > MaxAggregationFunction.combine(MaxAggregationFunction.java:108) > at java.lang.invoke.LambdaForm$DMH/1607453282.invokeStatic_ > L3_V(LambdaForm$DMH) > at java.lang.invoke.LambdaForm$BMH/1118134445.reinvoke( > LambdaForm$BMH) > at java.lang.invoke.LambdaForm$MH/1971758264. > linkToTargetMethod(LambdaForm$MH) > at com.facebook.presto.$gen.IntegerIntegerMaxGroupedAccumu > lator_3439.addIntermediate(Unknown Source) > at com.facebook.presto.operator.aggregation.builder. > InMemoryHashAggregationBuilder$Aggregator.processPage( > InMemoryHashAggregationBuilder.java:367) > at com.facebook.presto.operator.aggregation.builder. > InMemoryHashAggregationBuilder.processPage(InMemoryHashAggregationBuilder > .java:138) > at com.facebook.presto.operator.HashAggregationOperator. > addInput(HashAggregationOperator.java:400) > at com.facebook.presto.operator.Driver.processInternal(Driver. > java:343) > at com.facebook.presto.operator.Driver.lambda$processFor$6( > Driver.java:241) > at com.facebook.presto.operator.Driver$$Lambda$765/442308692.get(Unknown > Source) > at com.facebook.presto.operator.Driver.tryWithLock(Driver. > java:614) > at com.facebook.presto.operator.Driver.processFor(Driver.java: > 235) > at com.facebook.presto.execution.SqlTaskExecution$ > DriverSplitRunner.processFor(SqlTaskExecution.java:622) > at com.facebook.presto.execution.executor. 
> PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163) > at com.facebook.presto.execution.executor.TaskExecutor$ > TaskRunner.run(TaskExecutor.java:485) > at java.util.concurrent.ThreadPoolExecutor.runWorker( > ThreadPoolExecutor.java:1142) > at java.util.concurrent.ThreadPoolExecutor$Worker.run( > ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > ... > > > > Both issues go away after we restart the JVM, and the same query won't > trigger the LambdaForm compilation issue, so it looks like the JVM enters > some weird state. We are wondering if there is any thoughts on what could > trigger these issues? Or is there any suggestions about how to further > investigate it next time we see the VM in this state? > > Thank you. > > From vladimir.x.ivanov at oracle.com Tue Oct 3 12:10:17 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 3 Oct 2017 15:10:17 +0300 Subject: Questions about negative loaded classes ... In-Reply-To: References: Message-ID: <49bb66f9-49a7-3183-4410-b15176033e02@oracle.com> > 1. On more than half of the machines (200 out of 400 machines), we see he > JMX counter report negative LoadedClassCount, see attached jmxcounter.png. > > After some further dig, we note UnloadedClassCount is larger than > TotalLoadedClassCount. And LoadedClassCount (-695,710) = > TotalLoadedClassCount - UnloadedClassCount . PerfCounter reports the same > number, here is the result on the same machine: > > $ jcmd 307 PerfCounter.print | grep -i class | grep -i java.cls > java.cls.loadedClasses=192004392 > java.cls.sharedLoadedClasses=0 > java.cls.sharedUnloadedClasses=0 > java.cls.unloadedClasses=192700102 JVM performance counters aren't exact (e.g., updates aren't atomic [1]), so I wouldn't be surprised to see loadedClasses & unloadedClasses diverging during concurrent class loading. Best regards, Vladimir Ivanov [1] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/92693f9dd704/src/share/vm/runtime/perfData.hpp#l425 From robbin.ehn at oracle.com Tue Oct 3 12:19:36 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 3 Oct 2017 14:19:36 +0200 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <640fdf30-fc85-112f-ad11-b99cc071053e@oracle.com> <2df87576-cd2f-6d1d-4367-8a2956b88fea@oracle.com> Message-ID: <8439163a-3a94-6804-0e27-ce384be821cf@oracle.com> Hi, I'll leave that discussion for a while, another thing is: In os::Linux::available_memory(), OSContainer::memory_limit_in_bytes() the limit can be larger than actual ram. So we also need to check sysinfo e.g. return MIN(avail_mem, si.freeram * si.mem_unit). So I think the check against "if (XXX == 9223372036854771712)" is not needed at all for any of those methods. Just return what cgroup says if that is larger then the actual value pick the lower one. /Robbin On 10/03/2017 01:00 PM, David Holmes wrote: > Hi Robbin, > > On 3/10/2017 8:45 PM, Robbin Ehn wrote: >> Hi David, I think we are seen the issue from complete opposite. (this RFE could be pushed as a bug from my POV) > > Yes we see this completely opposite. I see this is a poorly integrated add-on API that we have to try to account for instead of being able to read an "always correct" value > from a standard OS API. 
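On Robbin's available_memory() point a little further up: the suggestion is to treat whatever the cgroup reports purely as an upper bound and clamp it against sysinfo, which also makes the special-case check for the huge "unlimited" sentinel unnecessary. A standalone sketch of that idea (cgroup v1 file layout assumed, not the webrev code):

  // mem_clamp.cpp - illustration only: clamp a cgroup v1 memory limit
  // against the machine's real memory, so an "unlimited" limit
  // (a huge sentinel such as 9223372036854771712) needs no special case.
  #include <sys/sysinfo.h>
  #include <fstream>
  #include <cstdint>
  #include <cstdio>
  #include <algorithm>

  int main() {
    uint64_t limit = UINT64_MAX;
    std::ifstream f("/sys/fs/cgroup/memory/memory.limit_in_bytes");
    if (f) f >> limit;                         // may be a huge "no limit" value

    struct sysinfo si;
    if (sysinfo(&si) != 0) return 1;
    uint64_t phys     = (uint64_t)si.totalram * si.mem_unit;
    uint64_t free_ram = (uint64_t)si.freeram  * si.mem_unit;

    // Whatever cgroup says, never report more than the host actually has.
    printf("memory limit : %llu\n", (unsigned long long)std::min(limit, phys));
    printf("available    : %llu\n", (unsigned long long)std::min(limit, free_ram));
    return 0;
  }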
They at least got the cpuset support correct by having sched_getaffinity correctly account for it. Alas the rest is ad-hoc. > >> >> On 10/03/2017 10:42 AM, David Holmes wrote: >>> On 3/10/2017 6:00 PM, Robbin Ehn wrote: >>>> Hi David, >>>> >>>> On 10/03/2017 12:46 AM, David Holmes wrote: >>>>> Hi Robbin, >>>>> >>>>> I have some views on this :) >>>>> >>>>> On 3/10/2017 6:20 AM, Robbin Ehn wrote: >>>>>> Hi Bob, >>>>>> >>>>>> As I said in your presentation for RT. >>>>>> If kernel if configured with cgroup this should always be read (otherwise we get wrong values). >>>>>> E.g. fedora have had cgroups default on several years (I believe most distros have it on). >>>>>> >>>>>> - No option is needed at all: right now we have wrong values your fix will provide right ones, why would you ever what to turn that off? >>>>> >>>>> It's not that you would want to turn that off (necessarily) but just because cgroups capability exists it doesn't mean they have actually been enabled and configured - >>>>> in which case reading all the cgroup info is unnecessary startup overhead. So for now this is opt-in - as was the experimental cgroup support we added. Once it becomes >>>>> clearer how this needs to be used we can adjust the defaults. For now this is enabling technology only. >>>> >>>> If cgroup are mounted they are on and the only way to know the configuration (such as no limits) is to actual read the cgroup filesystem. >>>> Therefore the flag make no sense. >>> >>> No that is exactly why it is opt-in! Why should we have to waste startup time reading a bunch of cgroup values just to determine that cgroups are not actually being used! >> >> If you have a cgroup enabled kernel they _are_ being used, no escaping that. > > A cgroup set to unlimited is not being used from a practical perspective. > >> cgroup is not a simple yes and no so for which resources depend on how you configured your kernel. >> To find out for what resource and what limits are set is we need to read them. >> >> I rather waste startup time (0.103292989 vs 0.103577139 seconds) and get values correct, so our heuristic works fine out-of-the-box. (and if you must, it opt-out) > > I'd rather people say "Hey I'm using this add-on resource management API so don't ask the OS but please query the add-on.". Yes that is a little harsh but the lack of > integration at the OS level is a huge impediment in my opinion. > >> Also I notice that we don't read the numa values so the phys mem method does a poor job. Correct would be check at least cgroup and numa bindings. > > NUMA is another minefield. > >> We also have this option UseCGroupMemoryLimitForHeap which should be removed. > > Bob already addressed why he was not getting rid of that initially. > >>> >>>>> >>>>>> - log target container would make little sense since almost all linuxes run with croups on. >>>>> >>>>> Again the capability is present but may not be enabled/configured. >>>> >>>> The capability is on if cgroup are mount and the only way to know the configuration is to read the cgroup filesystem. >>>> >>>>> >>>>>> - For cpuset, the processes affinity mask already reflect cgroup setting so you don't need to look into cgroup for that >>>>>> ?? If you do, you would miss any processes specific affinity mask. So _cpu_count() should already be returning the right number of CPU's. >>>>> >>>>> While the process affinity mask reflect cpusets (and we already use it for that reason), it doesn't reflect shares and quotas. 
And if shares/quotas are enforced and >>>>> someone sets a custom affinity mask, what is it all supposed to mean? That's one of the main reasons to allow the number of cpu's to be hardwired via a flag. So it's >>>>> better IMHO to read everything from the cgroups if configured to use cgroups. >>>> >>>> I'm not taking about shares and quotes, they should be read of course, but cpuset should be checked such as in _cpu_count. >>>> >>>> Here is the bug: >>>> >>>> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp . ForEver | grep proc >>>> [0.002s][debug][os] Initial active processor count set to 4 >>>> ^C >>>> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -XX:+UseContainerSupport -Xlog:os=debug -cp . ForEver | grep proc >>>> [0.003s][debug][os] Initial active processor count set to 32 >>>> ^C >>>> >>>> _cpu_count already does the right thing. >>> >>> But how do you then combine that information with the use of shares and/or quotas? >> >> That I don't know, wild naive guess would be: >> active count ~ MIN(OSContainer::pd_active_processor_count(), cpuset); :) > > That would be one option but it may not be meaningful. That said I don't think the use of quota or shares to define the number of available CPUs makes sense anyway. > > Personally I don't think mixing direct use of cpusets with cgroup defined limits makes much sense. > >> I assume everything we need to know is in: https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt > > Nope that only addresses cpusets. The one part of this that at least makes sense in isolation. > > Cheers, > David > >> Thanks, Robbin >> >>> >>> David >>> ----- >>> >>>> Thanks, Robbin >>>> >>>> >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>>> >>>>>> Thanks for trying to fixing this! >>>>>> >>>>>> /Robbin >>>>>> >>>>>> On 09/22/2017 04:27 PM, Bob Vandette wrote: >>>>>>> Please review these changes that improve on docker container detection and the >>>>>>> automatic configuration of the number of active CPUs and total and free memory >>>>>>> based on the containers resource limitation settings and metric data files. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ >>>>>>> >>>>>>> These changes are enabled with -XX:+UseContainerSupport. >>>>>>> >>>>>>> You can enable logging for this support via -Xlog:os+container=trace. >>>>>>> >>>>>>> Since the dynamic selection of CPUs based on cpusets, quotas and shares >>>>>>> may not satisfy every users needs, I?ve added an additional flag to allow the >>>>>>> number of CPUs to be overridden.? This flag is named -XX:ActiveProcessorCount=xx. >>>>>>> >>>>>>> >>>>>>> Bob. >>>>>>> >>>>>>> >>>>>>> From bob.vandette at oracle.com Tue Oct 3 12:25:13 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 3 Oct 2017 08:25:13 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> Message-ID: <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> After talking to a number of folks and getting feedback, my current thinking is to enable the support by default. I still want to include the flag for at least one Java release in the event that the new behavior causes some regression in behavior. 
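To make the fallback idea concrete: a cheap way to decide whether there is anything to read at all is to look for mounted cgroup controllers before touching any limit files, and fall back to the plain os:: queries otherwise. A standalone heuristic sketch (cgroup v1 only, hypothetical helper name, not the actual webrev detection code):

  // cgroup_probe.cpp - illustration: cheap check for a mounted cgroup v1
  // controller by scanning /proc/self/mountinfo (format assumed as on Linux).
  #include <fstream>
  #include <string>
  #include <cstdio>

  static bool cgroup_mounted(const std::string& controller) {
    std::ifstream mi("/proc/self/mountinfo");
    std::string line;
    while (std::getline(mi, line)) {
      // cgroup v1 mounts show up with fstype "cgroup" after the " - "
      // separator and the controller name in the super options.
      if (line.find(" - cgroup ") != std::string::npos &&
          line.find(controller)  != std::string::npos) {
        return true;
      }
    }
    return false;
  }

  int main() {
    if (cgroup_mounted("memory") || cgroup_mounted("cpu")) {
      printf("cgroup controllers mounted - container support can engage\n");
    } else {
      printf("no cgroup mounts found - fall back to plain os:: queries\n");
    }
    return 0;
  }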
I?m trying to make the detection robust so that it will fallback to the current behavior in the event that cgroups is not configured as expected but I?d like to have a way of forcing the issue. JDK 10 is not supposed to be a long term support release which makes it a good target for this new behavior. I agree with David that once we commit to cgroups, we should extract all VM configuration data from that source. There?s more information available for cpusets than just processor affinity that we might want to consider when calculating the number of processors to assume for the VM. There?s exclusivity and effective cpu data available in addition to the cpuset string. Bob. > On Oct 3, 2017, at 4:00 AM, Robbin Ehn wrote: > > Hi David, > > On 10/03/2017 12:46 AM, David Holmes wrote: >> Hi Robbin, >> I have some views on this :) >> On 3/10/2017 6:20 AM, Robbin Ehn wrote: >>> Hi Bob, >>> >>> As I said in your presentation for RT. >>> If kernel if configured with cgroup this should always be read (otherwise we get wrong values). >>> E.g. fedora have had cgroups default on several years (I believe most distros have it on). >>> >>> - No option is needed at all: right now we have wrong values your fix will provide right ones, why would you ever what to turn that off? >> It's not that you would want to turn that off (necessarily) but just because cgroups capability exists it doesn't mean they have actually been enabled and configured - in which case reading all the cgroup info is unnecessary startup overhead. So for now this is opt-in - as was the experimental cgroup support we added. Once it becomes clearer how this needs to be used we can adjust the defaults. For now this is enabling technology only. > > If cgroup are mounted they are on and the only way to know the configuration (such as no limits) is to actual read the cgroup filesystem. > Therefore the flag make no sense. > >>> - log target container would make little sense since almost all linuxes run with croups on. >> Again the capability is present but may not be enabled/configured. > > The capability is on if cgroup are mount and the only way to know the configuration is to read the cgroup filesystem. > >>> - For cpuset, the processes affinity mask already reflect cgroup setting so you don't need to look into cgroup for that >>> If you do, you would miss any processes specific affinity mask. So _cpu_count() should already be returning the right number of CPU's. >> While the process affinity mask reflect cpusets (and we already use it for that reason), it doesn't reflect shares and quotas. And if shares/quotas are enforced and someone sets a custom affinity mask, what is it all supposed to mean? That's one of the main reasons to allow the number of cpu's to be hardwired via a flag. So it's better IMHO to read everything from the cgroups if configured to use cgroups. > > I'm not taking about shares and quotes, they should be read of course, but cpuset should be checked such as in _cpu_count. > > Here is the bug: > > [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp . ForEver | grep proc > [0.002s][debug][os] Initial active processor count set to 4 > ^C > [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -XX:+UseContainerSupport -Xlog:os=debug -cp . ForEver | grep proc > [0.003s][debug][os] Initial active processor count set to 32 > ^C > > _cpu_count already does the right thing. > > Thanks, Robbin > > >> Cheers, >> David >>> >>> Thanks for trying to fixing this! 
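On the cpuset data Bob mentions earlier in this message: cpuset.cpus and cpuset.effective_cpus hold a list string such as "0-2,6", so turning one into a processor count is only a few lines. A hedged sketch of such a parser (illustration only, helper name made up):

  // cpuset_count.cpp - illustration: count CPUs in a cpuset list string
  // such as "0-2,6" (the format used by cpuset.cpus / cpuset.effective_cpus).
  #include <cstdio>
  #include <cstdlib>

  static int cpus_in_list(const char* s) {
    int count = 0;
    while (*s != '\0' && *s != '\n') {
      char* end = nullptr;
      long lo = strtol(s, &end, 10);
      long hi = lo;
      if (*end == '-') {                  // a range like "0-2"
        hi = strtol(end + 1, &end, 10);
      }
      count += (int)(hi - lo + 1);
      s = (*end == ',') ? end + 1 : end;  // skip the separator, if any
    }
    return count;
  }

  int main() {
    printf("%d\n", cpus_in_list("0-2,6"));   // prints 4
    return 0;
  }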
>>> >>> /Robbin >>> >>> On 09/22/2017 04:27 PM, Bob Vandette wrote: >>>> Please review these changes that improve on docker container detection and the >>>> automatic configuration of the number of active CPUs and total and free memory >>>> based on the containers resource limitation settings and metric data files. >>>> >>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ >>>> >>>> These changes are enabled with -XX:+UseContainerSupport. >>>> >>>> You can enable logging for this support via -Xlog:os+container=trace. >>>> >>>> Since the dynamic selection of CPUs based on cpusets, quotas and shares >>>> may not satisfy every users needs, I?ve added an additional flag to allow the >>>> number of CPUs to be overridden. This flag is named -XX:ActiveProcessorCount=xx. >>>> >>>> >>>> Bob. >>>> >>>> >>>> From erik.osterlund at oracle.com Tue Oct 3 12:29:07 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 3 Oct 2017 14:29:07 +0200 Subject: RFR (M): 8188224: Generalize Atomic::load/store to use templates Message-ID: <59D38293.7030800@oracle.com> Hi, The time has come to generalize Atomic::load/store with templates - the last operation to generalize in Atomic. The design was inspired by Atomic::xchg and uses a similar mechanism to validate the passed in arguments. It was also designed with coming OrderAccess changes in mind. OrderAccess also contains loads and stores that will reuse the LoadImpl and StoreImpl infrastructure in Atomic::load/store. (the type checking for what is okay to pass in to Atomic::load/store is very much the same for OrderAccess::load_acquire/*store*). One thing worth mentioning is that the bsd zero port (but notably not the linux zero port) had a leading fence for atomic stores of jint when #if !defined(ARM) && !defined(M68K) is true without any comment describing why. So I took the liberty of removing it. Atomic should not have any fencing at all - that is what OrderAccess is for. In fact Atomic does not promise any memory ordering semantics for loads and stores. Atomic merely provides relaxed accesses that are atomic. Worth mentioning nevertheless in case anyone wants to keep that jint Atomic::store fence on bsd zero !M68K && !ARM. Bug: https://bugs.openjdk.java.net/browse/JDK-8188224 Webrev: http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00/ Testing: JPRT, mach5 hs-tier3 Thanks, /Erik From robbin.ehn at oracle.com Tue Oct 3 12:39:38 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 3 Oct 2017 14:39:38 +0200 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> Message-ID: On 10/03/2017 02:25 PM, Bob Vandette wrote: > After talking to a number of folks and getting feedback, my current thinking is to enable the support by default. Great. > > I still want to include the flag for at least one Java release in the event that the new behavior causes some regression > in behavior. I?m trying to make the detection robust so that it will fallback to the current behavior in the event > that cgroups is not configured as expected but I?d like to have a way of forcing the issue. 
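For readers without the webrev open, the general shape being discussed is templated relaxed accessors that validate the type and then perform a plain volatile access, with platform specializations behind them. A much-simplified sketch with made-up names - the real patch routes through LoadImpl/StoreImpl and the per-platform layers, so this is only meant to convey the idea:

  // atomic_sketch.cpp - simplified illustration only, not the actual patch.
  #include <type_traits>
  #include <cstdint>

  template <typename T>
  inline T relaxed_load(const volatile T* src) {
    static_assert(std::is_trivially_copyable<T>::value, "needs a trivial type");
    static_assert(sizeof(T) <= sizeof(void*), "wider types need a platform specialization");
    return *src;                 // plain volatile load; no ordering implied
  }

  template <typename T>
  inline void relaxed_store(T value, volatile T* dest) {
    static_assert(std::is_trivially_copyable<T>::value, "needs a trivial type");
    static_assert(sizeof(T) <= sizeof(void*), "wider types need a platform specialization");
    *dest = value;               // plain volatile store; no ordering implied
  }

  int main() {
    volatile intptr_t cell = 0;
    relaxed_store<intptr_t>((intptr_t)42, &cell);
    return relaxed_load(&cell) == 42 ? 0 : 1;
  }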
JDK 10 is not > supposed to be a long term support release which makes it a good target for this new behavior. > > I agree with David that once we commit to cgroups, we should extract all VM configuration data from that > source. There?s more information available for cpusets than just processor affinity that we might want to > consider when calculating the number of processors to assume for the VM. There?s exclusivity and > effective cpu data available in addition to the cpuset string. cgroup only contains limits, not the real hard limits. You most consider the affinity mask. We that have numa nodes do: [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -cp . ForEver | grep proc [0.001s][debug][os] Initial active processor count set to 16 [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc [0.001s][debug][os] Initial active processor count set to 32 when benchmarking all the time and that must be set to 16 otherwise the flag is really bad for us. So the flag actually breaks the little numa support we have now. Thanks, Robbin > > Bob. > > >> On Oct 3, 2017, at 4:00 AM, Robbin Ehn wrote: >> >> Hi David, >> >> On 10/03/2017 12:46 AM, David Holmes wrote: >>> Hi Robbin, >>> I have some views on this :) >>> On 3/10/2017 6:20 AM, Robbin Ehn wrote: >>>> Hi Bob, >>>> >>>> As I said in your presentation for RT. >>>> If kernel if configured with cgroup this should always be read (otherwise we get wrong values). >>>> E.g. fedora have had cgroups default on several years (I believe most distros have it on). >>>> >>>> - No option is needed at all: right now we have wrong values your fix will provide right ones, why would you ever what to turn that off? >>> It's not that you would want to turn that off (necessarily) but just because cgroups capability exists it doesn't mean they have actually been enabled and configured - in which case reading all the cgroup info is unnecessary startup overhead. So for now this is opt-in - as was the experimental cgroup support we added. Once it becomes clearer how this needs to be used we can adjust the defaults. For now this is enabling technology only. >> >> If cgroup are mounted they are on and the only way to know the configuration (such as no limits) is to actual read the cgroup filesystem. >> Therefore the flag make no sense. >> >>>> - log target container would make little sense since almost all linuxes run with croups on. >>> Again the capability is present but may not be enabled/configured. >> >> The capability is on if cgroup are mount and the only way to know the configuration is to read the cgroup filesystem. >> >>>> - For cpuset, the processes affinity mask already reflect cgroup setting so you don't need to look into cgroup for that >>>> If you do, you would miss any processes specific affinity mask. So _cpu_count() should already be returning the right number of CPU's. >>> While the process affinity mask reflect cpusets (and we already use it for that reason), it doesn't reflect shares and quotas. And if shares/quotas are enforced and someone sets a custom affinity mask, what is it all supposed to mean? That's one of the main reasons to allow the number of cpu's to be hardwired via a flag. So it's better IMHO to read everything from the cgroups if configured to use cgroups. >> >> I'm not taking about shares and quotes, they should be read of course, but cpuset should be checked such as in _cpu_count. 
>> >> Here is the bug: >> >> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp . ForEver | grep proc >> [0.002s][debug][os] Initial active processor count set to 4 >> ^C >> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -XX:+UseContainerSupport -Xlog:os=debug -cp . ForEver | grep proc >> [0.003s][debug][os] Initial active processor count set to 32 >> ^C >> >> _cpu_count already does the right thing. >> >> Thanks, Robbin >> >> >>> Cheers, >>> David >>>> >>>> Thanks for trying to fixing this! >>>> >>>> /Robbin >>>> >>>> On 09/22/2017 04:27 PM, Bob Vandette wrote: >>>>> Please review these changes that improve on docker container detection and the >>>>> automatic configuration of the number of active CPUs and total and free memory >>>>> based on the containers resource limitation settings and metric data files. >>>>> >>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ >>>>> >>>>> These changes are enabled with -XX:+UseContainerSupport. >>>>> >>>>> You can enable logging for this support via -Xlog:os+container=trace. >>>>> >>>>> Since the dynamic selection of CPUs based on cpusets, quotas and shares >>>>> may not satisfy every users needs, I?ve added an additional flag to allow the >>>>> number of CPUs to be overridden. This flag is named -XX:ActiveProcessorCount=xx. >>>>> >>>>> >>>>> Bob. >>>>> >>>>> >>>>> > From david.holmes at oracle.com Tue Oct 3 12:44:19 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Oct 2017 22:44:19 +1000 Subject: RFR (M): 8188224: Generalize Atomic::load/store to use templates In-Reply-To: <59D38293.7030800@oracle.com> References: <59D38293.7030800@oracle.com> Message-ID: <712e1c4e-b38b-11c3-4b51-d88f1560a063@oracle.com> Hi Erik, A lot of jumping through hoops just to do a direct load/store in the bulk of cases - but okay, we're embracing templates. 66 // Atomically store to a location 67 // See comment above about using jlong atomics on 32-bit platforms The comment at #67 and the equivalent one for load can be deleted. The "comment above" should only be referring to r-m-w atomic ops not basic load and store. All platforms must have a means to do atomic load/store of 64-bit due to Java volatile variables (eg by using floating-point unit on 32-bit) but may not have cmpxchg<8> capability. (I failed to convince the author of this when those comments went in. ;-) ) Cheers, David On 3/10/2017 10:29 PM, Erik ?sterlund wrote: > Hi, > > The time has come to generalize Atomic::load/store with templates - the > last operation to generalize in Atomic. > The design was inspired by Atomic::xchg and uses a similar mechanism to > validate the passed in arguments. It was also designed with coming > OrderAccess changes in mind. OrderAccess also contains loads and stores > that will reuse the LoadImpl and StoreImpl infrastructure in > Atomic::load/store. (the type checking for what is okay to pass in to > Atomic::load/store is very much the same for > OrderAccess::load_acquire/*store*). > > One thing worth mentioning is that the bsd zero port (but notably not > the linux zero port) had a leading fence for atomic stores of jint when > #if !defined(ARM) && !defined(M68K) is true without any comment > describing why. So I took the liberty of removing it. Atomic should not > have any fencing at all - that is what OrderAccess is for. In fact > Atomic does not promise any memory ordering semantics for loads and > stores. Atomic merely provides relaxed accesses that are atomic. 
Worth > mentioning nevertheless in case anyone wants to keep that jint > Atomic::store fence on bsd zero !M68K && !ARM. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8188224 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00/ > > Testing: JPRT, mach5 hs-tier3 > > Thanks, > /Erik From erik.osterlund at oracle.com Tue Oct 3 12:58:11 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 3 Oct 2017 14:58:11 +0200 Subject: RFR (M): 8188224: Generalize Atomic::load/store to use templates In-Reply-To: <712e1c4e-b38b-11c3-4b51-d88f1560a063@oracle.com> References: <59D38293.7030800@oracle.com> <712e1c4e-b38b-11c3-4b51-d88f1560a063@oracle.com> Message-ID: <59D38963.2070806@oracle.com> Hi David, Thanks for the review. The comments have been removed. New full webrev: http://cr.openjdk.java.net/~eosterlund/8188224/webrev.01/ New incremental webrev: http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00_01/ Thanks, /Erik On 2017-10-03 14:44, David Holmes wrote: > Hi Erik, > > A lot of jumping through hoops just to do a direct load/store in the > bulk of cases - but okay, we're embracing templates. > > 66 // Atomically store to a location > 67 // See comment above about using jlong atomics on 32-bit platforms > > The comment at #67 and the equivalent one for load can be deleted. The > "comment above" should only be referring to r-m-w atomic ops not basic > load and store. All platforms must have a means to do atomic > load/store of 64-bit due to Java volatile variables (eg by using > floating-point unit on 32-bit) but may not have cmpxchg<8> capability. > (I failed to convince the author of this when those comments went in. > ;-) ) > > Cheers, > David > > On 3/10/2017 10:29 PM, Erik ?sterlund wrote: >> Hi, >> >> The time has come to generalize Atomic::load/store with templates - >> the last operation to generalize in Atomic. >> The design was inspired by Atomic::xchg and uses a similar mechanism >> to validate the passed in arguments. It was also designed with coming >> OrderAccess changes in mind. OrderAccess also contains loads and >> stores that will reuse the LoadImpl and StoreImpl infrastructure in >> Atomic::load/store. (the type checking for what is okay to pass in to >> Atomic::load/store is very much the same for >> OrderAccess::load_acquire/*store*). >> >> One thing worth mentioning is that the bsd zero port (but notably not >> the linux zero port) had a leading fence for atomic stores of jint >> when #if !defined(ARM) && !defined(M68K) is true without any comment >> describing why. So I took the liberty of removing it. Atomic should >> not have any fencing at all - that is what OrderAccess is for. In >> fact Atomic does not promise any memory ordering semantics for loads >> and stores. Atomic merely provides relaxed accesses that are atomic. >> Worth mentioning nevertheless in case anyone wants to keep that jint >> Atomic::store fence on bsd zero !M68K && !ARM. 
>> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8188224 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00/ >> >> Testing: JPRT, mach5 hs-tier3 >> >> Thanks, >> /Erik From coleen.phillimore at oracle.com Tue Oct 3 14:23:26 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Oct 2017 10:23:26 -0400 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: <1498efad-e443-5875-cc20-b0d0c926e883@oracle.com> References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> <1498efad-e443-5875-cc20-b0d0c926e883@oracle.com> Message-ID: <7982f8eb-e4ba-8c09-f15f-e33797553141@oracle.com> Here is an updated webrev with fixes for your comments. open webrev at http://cr.openjdk.java.net/~coleenp/8186777.03/webrev Thanks for reviewing and all your help with this! Coleen On 9/29/17 6:41 AM, Stefan Karlsson wrote: > Hi Coleen, > > I started looking at this, but will need a second round before I've > fully reviewed the GC parts. > > Here are some nits that would be nice to get cleaned up. > > ========== > http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/classfile/classLoaderData.cpp.frames.html > > > ?788???? record_modified_oops();? // necessary? > > This could be removed. Only G1 cares about deleted "weak" references. > > Or we can wait until Erik?'s GC Barrier Interface is in place and > remove it then. > > ---------- > > ?#ifdef CLD_DUMP_KLASSES > ?? if (Verbose) { > ???? Klass* k = _klasses; > ???? while (k != NULL) { > -????? out->print_cr("klass " PTR_FORMAT ", %s, CT: %d, MUT: %d", k, > k->name()->as_C_string(), > -????????? k->has_modified_oops(), k->has_accumulated_modified_oops()); > +????? out->print_cr("klass " PTR_FORMAT ", %s", k, > k->name()->as_C_string()); > ?????? assert(k != k->next_link(), "no loops!"); > ?????? k = k->next_link(); > ???? } > ?? } > ?#endif? // CLD_DUMP_KLASSES > > Pre-existing: I don't think this will compile if you turn on > CLD_DUMP_KLASSES. k must be p2i(k). > > ========== > http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/classfile/classLoaderData.hpp.udiff.html > > > +? // Remembered sets support for the oops in the class loader data. > +? jbyte _modified_oops;???????????? // Card Table Equivalent (YC/CMS > support) > +? jbyte _accumulated_modified_oops; // Mod Union Equivalent (CMS > support) > > We should create a follow-up bug to change these jbytes to bools. > > ========== > http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/g1/g1HeapVerifier.cpp.frames.html > > > Spurious addition: > +? G1CollectedHeap* _g1h; > > ========== > http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/g1/g1OopClosures.hpp.udiff.html > > > Spurious addition?: > +? G1CollectedHeap* g1() { return _g1; } > > ========== > http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/parallel/psScavenge.inline.hpp.patch > > > ?? PSPromotionManager* _pm; > -? // Used to redirty a scanned klass if it has oops > +? // Used to redirty a scanned cld if it has oops > ?? // pointing to the young generation after being scanned. > -? Klass*???????????? _scanned_klass; > +? ClassLoaderData*???????????? _scanned_cld; > > Indentation. > > ========== > http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/parallel/psTasks.cpp.frames.html > > > ? 80???? case class_loader_data: > ? 81???? { > ? 82?????? PSScavengeCLDClosure ps(pm); > ? 83?????? 
ClassLoaderDataGraph::cld_do(&ps); > ? 84???? } > > Would you mind changing the name ps to cld_closure? > > ========== > http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/shared/genOopClosures.hpp.patch > > > +? OopsInClassLoaderDataOrGenClosure*?? _scavenge_closure; > ?? // true if the the modified oops state should be saved. > ?? bool???????????????????? _accumulate_modified_oops; > > Indentation. > > ---------- > +? void do_cld(ClassLoaderData* k); > > Rename k? > > Thanks, > StefanK > > On 2017-09-28 23:36, coleen.phillimore at oracle.com wrote: >> >> Thank you to Stefan Karlsson offlist for pointing out that the >> previous .01 version of this webrev breaks CMS in that it doesn't >> remember ClassLoaderData::_handles that are changed and added while >> concurrent marking is in progress.? I've fixed this bug to move the >> Klass::_modified_oops and _accumulated_modified_oops to the >> ClassLoaderData and use these fields in the CMS remarking phase to >> catch any new handles that are added.?? This also fixes this bug >> https://bugs.openjdk.java.net/browse/JDK-8173988 . >> >> In addition, the previous version of this change removed an >> optimization during young collection, which showed some uncertain >> performance regression in young pause times, so I added this >> optimization back to not walk ClassLoaderData during young >> collections if all the oops are old.? The performance results of >> SPECjbb2015 now are slightly better, but not significantly. >> >> This latest patch has been tested on tier1-5 on linux x64 and windows >> x64 in mach5 test harness. >> >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ >> >> Can I get at least 3 reviewers?? One from each of the compiler, gc, >> and runtime group at least since there are changes to all 3. >> >> Thanks! >> Coleen >> >> >> On 9/6/17 12:04 PM, coleen.phillimore at oracle.com wrote: >>> Summary: Add indirection for fetching mirror so that GC doesn't have >>> to follow CLD::_klasses >>> >>> Thank you to Tom Rodriguez for Graal changes and Rickard for the C2 >>> changes. >>> >>> Ran nightly tests through Mach5 and RBT.?? Early performance testing >>> showed good performance improvment in GC class loader data >>> processing time, but nmethod processing time continues to dominate. >>> Also performace testing showed no throughput regression.?? I'm >>> rerunning both of these performance testing and will post the numbers. >>> >>> bug link https://bugs.openjdk.java.net/browse/JDK-8186777 >>> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.01/webrev >>> >>> Thanks, >>> Coleen From dmitry.chuyko at bell-sw.com Tue Oct 3 14:24:03 2017 From: dmitry.chuyko at bell-sw.com (Dmitry Chuyko) Date: Tue, 3 Oct 2017 17:24:03 +0300 Subject: [10] RFR: 8186671 - AARCH64: Use `yield` instruction in SpinPause on linux-aarch64 In-Reply-To: References: Message-ID: <70a22c6b-3716-0355-b80c-c0c2b84ec3a2@bell-sw.com> Over the past time there have been no objections,, Andrew, can you please sponsor the change? Thanks, -Dmitry On 09/27/2017 08:04 PM, Dmitry Chuyko wrote: > > Hello, > > Re-sending this to hotspot-dev on the advice of Adrew, the patch is > updated for consolidated repo. > > rfe: https://bugs.openjdk.java.net/browse/JDK-8186671 > webrev: http://cr.openjdk.java.net/~dchuyko/8186671/webrev.01/ > original thread: > http://mail.openjdk.java.net/pipermail/aarch64-port-dev/2017-August/004870.html > > The function was moved to platform .S file and now implemented with > yield instruction. 
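For context, the functional effect of the patch is tiny; expressed as C++ with inline assembly it is roughly the following. This is a sketch only - the actual change lives in the platform .S file, and the return value below just follows the usual SpinPause convention as I understand it:

  // spin_pause_sketch.cpp - rough equivalent of the proposed aarch64
  // SpinPause: issue a 'yield' hint in the spin loop.  Illustration only.
  extern "C" int SpinPause() {
  #if defined(__aarch64__)
    asm volatile("yield");
    return 1;   // non-zero: a pause hint was issued (assumed contract)
  #else
    return 0;   // no pause hint available on this target
  #endif
  }

  int main() { return SpinPause() >= 0 ? 0 : 1; }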
> > -Dmitry > > > -------- Forwarded Message -------- > Subject: Re: [aarch64-port-dev ] RFR: 8186671: Use `yield` > instruction in SpinPause on linux-aarch64 > Date: Sat, 2 Sep 2017 09:10:00 +0100 > From: Andrew Haley > To: Dmitry Chuyko , > aarch64-port-dev at openjdk.java.net > > > > On 01/09/17 17:26, Dmitry Chuyko wrote: > > There were no objections to this part (extern). I need sponsorship to > > push the change. > > I can do it, but it really needs to be sent to hotspot-dev. > > > It would be interesting to discuss the other (intrinsic) part a bit more > > at fireside chat. > > OK, but without any actual implementations we can test it'll be a very > short discussion. > > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Tue Oct 3 14:30:13 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 3 Oct 2017 15:30:13 +0100 Subject: [10] RFR: 8186671 - AARCH64: Use `yield` instruction in SpinPause on linux-aarch64 In-Reply-To: <70a22c6b-3716-0355-b80c-c0c2b84ec3a2@bell-sw.com> References: <70a22c6b-3716-0355-b80c-c0c2b84ec3a2@bell-sw.com> Message-ID: <6a8b007f-2b1c-c8a8-5b5e-6025ccc6dbd6@redhat.com> On 03/10/17 15:24, Dmitry Chuyko wrote: > Over the past time there have been no objections,, > > Andrew, can you please sponsor the change? No, let's discuss it on Thursday. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From bob.vandette at oracle.com Tue Oct 3 14:41:38 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 3 Oct 2017 10:41:38 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> Message-ID: <82E66654-2AF3-45EB-B996-45C7DE4191D2@oracle.com> > On Oct 3, 2017, at 8:39 AM, Robbin Ehn wrote: > > On 10/03/2017 02:25 PM, Bob Vandette wrote: >> After talking to a number of folks and getting feedback, my current thinking is to enable the support by default. > > Great. > >> I still want to include the flag for at least one Java release in the event that the new behavior causes some regression >> in behavior. I?m trying to make the detection robust so that it will fallback to the current behavior in the event >> that cgroups is not configured as expected but I?d like to have a way of forcing the issue. JDK 10 is not >> supposed to be a long term support release which makes it a good target for this new behavior. >> I agree with David that once we commit to cgroups, we should extract all VM configuration data from that >> source. There?s more information available for cpusets than just processor affinity that we might want to >> consider when calculating the number of processors to assume for the VM. There?s exclusivity and >> effective cpu data available in addition to the cpuset string. > > cgroup only contains limits, not the real hard limits. > You most consider the affinity mask. We that have numa nodes do: > > [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -cp . ForEver | grep proc > [0.001s][debug][os] Initial active processor count set to 16 > [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -XX:+UseContainerSupport -cp . 
ForEver | grep proc > [0.001s][debug][os] Initial active processor count set to 32 > > when benchmarking all the time and that must be set to 16 otherwise the flag is really bad for us. > So the flag actually breaks the little numa support we have now. Thanks for sharing those results. I?ll look into this. I?m hoping this is due to the fact that I am not yet examining the memory node files in the cgroup file system. Bob. > > Thanks, Robbin > >> Bob. >>> On Oct 3, 2017, at 4:00 AM, Robbin Ehn wrote: >>> >>> Hi David, >>> >>> On 10/03/2017 12:46 AM, David Holmes wrote: >>>> Hi Robbin, >>>> I have some views on this :) >>>> On 3/10/2017 6:20 AM, Robbin Ehn wrote: >>>>> Hi Bob, >>>>> >>>>> As I said in your presentation for RT. >>>>> If kernel if configured with cgroup this should always be read (otherwise we get wrong values). >>>>> E.g. fedora have had cgroups default on several years (I believe most distros have it on). >>>>> >>>>> - No option is needed at all: right now we have wrong values your fix will provide right ones, why would you ever what to turn that off? >>>> It's not that you would want to turn that off (necessarily) but just because cgroups capability exists it doesn't mean they have actually been enabled and configured - in which case reading all the cgroup info is unnecessary startup overhead. So for now this is opt-in - as was the experimental cgroup support we added. Once it becomes clearer how this needs to be used we can adjust the defaults. For now this is enabling technology only. >>> >>> If cgroup are mounted they are on and the only way to know the configuration (such as no limits) is to actual read the cgroup filesystem. >>> Therefore the flag make no sense. >>> >>>>> - log target container would make little sense since almost all linuxes run with croups on. >>>> Again the capability is present but may not be enabled/configured. >>> >>> The capability is on if cgroup are mount and the only way to know the configuration is to read the cgroup filesystem. >>> >>>>> - For cpuset, the processes affinity mask already reflect cgroup setting so you don't need to look into cgroup for that >>>>> If you do, you would miss any processes specific affinity mask. So _cpu_count() should already be returning the right number of CPU's. >>>> While the process affinity mask reflect cpusets (and we already use it for that reason), it doesn't reflect shares and quotas. And if shares/quotas are enforced and someone sets a custom affinity mask, what is it all supposed to mean? That's one of the main reasons to allow the number of cpu's to be hardwired via a flag. So it's better IMHO to read everything from the cgroups if configured to use cgroups. >>> >>> I'm not taking about shares and quotes, they should be read of course, but cpuset should be checked such as in _cpu_count. >>> >>> Here is the bug: >>> >>> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp . ForEver | grep proc >>> [0.002s][debug][os] Initial active processor count set to 4 >>> ^C >>> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -XX:+UseContainerSupport -Xlog:os=debug -cp . ForEver | grep proc >>> [0.003s][debug][os] Initial active processor count set to 32 >>> ^C >>> >>> _cpu_count already does the right thing. >>> >>> Thanks, Robbin >>> >>> >>>> Cheers, >>>> David >>>>> >>>>> Thanks for trying to fixing this! 
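To make the point above concrete: the scheduler affinity mask (which taskset, numactl and cpusets all feed into) and the cgroup cpu limits are two independent caps, and a container-aware processor count arguably has to honour both and take the smaller one. The sketch below is simplified and hypothetical, not the webrev code; the quota value is passed in because deriving it from the cgroup files is out of scope here.

    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE
    #endif
    #include <sched.h>
    #include <algorithm>

    // Sketch: combine the affinity mask with a cgroup-derived limit.
    // 'quota_cpus' is assumed to be computed elsewhere from
    // cpu.cfs_quota_us / cpu.cfs_period_us (or cpuset.cpus).
    static int active_processor_count_sketch(int quota_cpus) {
      cpu_set_t mask;
      if (sched_getaffinity(0, sizeof(cpu_set_t), &mask) != 0) {
        return quota_cpus;                    // fall back if the syscall fails
      }
      int affinity_cpus = CPU_COUNT(&mask);   // e.g. 4 for --cpu-list 0-2,6
      return std::min(affinity_cpus, quota_cpus);
    }

With that shape, the 'taskset --cpu-list 0-2,6' case above would report 4 whether or not container support is enabled, and a cgroup quota smaller than the mask would still win.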
>>>>> >>>>> /Robbin >>>>> >>>>> On 09/22/2017 04:27 PM, Bob Vandette wrote: >>>>>> Please review these changes that improve on docker container detection and the >>>>>> automatic configuration of the number of active CPUs and total and free memory >>>>>> based on the containers resource limitation settings and metric data files. >>>>>> >>>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ >>>>>> >>>>>> These changes are enabled with -XX:+UseContainerSupport. >>>>>> >>>>>> You can enable logging for this support via -Xlog:os+container=trace. >>>>>> >>>>>> Since the dynamic selection of CPUs based on cpusets, quotas and shares >>>>>> may not satisfy every users needs, I?ve added an additional flag to allow the >>>>>> number of CPUs to be overridden. This flag is named -XX:ActiveProcessorCount=xx. >>>>>> >>>>>> >>>>>> Bob. >>>>>> >>>>>> >>>>>> From aph at redhat.com Tue Oct 3 14:56:11 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 3 Oct 2017 15:56:11 +0100 Subject: RFR (M): 8188224: Generalize Atomic::load/store to use templates In-Reply-To: <712e1c4e-b38b-11c3-4b51-d88f1560a063@oracle.com> References: <59D38293.7030800@oracle.com> <712e1c4e-b38b-11c3-4b51-d88f1560a063@oracle.com> Message-ID: <599fbf96-4439-ba00-e0a2-0599f0de057f@redhat.com> On 03/10/17 13:44, David Holmes wrote: > A lot of jumping through hoops just to do a direct load/store in the > bulk of cases - but okay, we're embracing templates. That doesn't really follow: embracing templates often makes generic code simpler, with fewer hoops. That's the idea, as I understand it. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From wenlei.xie at gmail.com Tue Oct 3 18:14:40 2017 From: wenlei.xie at gmail.com (Wenlei Xie) Date: Tue, 3 Oct 2017 11:14:40 -0700 Subject: Questions about negative loaded classes and Lambda Form Compilation In-Reply-To: References: Message-ID: Hi, We are still seeing this on 1.8.0_144. Just wondering these is any idea what might cause this, or what kind of thing we can do to investigate the VM is in this state? Thank you !! BTW: I note the attachment doesn't seem to work. So here is the link to screenshot about the negative number of loaded classes: https://imgur.com/a/ kGbto On Wed, Sep 27, 2017 at 11:03 AM, Wenlei Xie wrote: > Hi, > > We recently see some weird behavior of JVM in our production cluster. We > are running JDK 1.8.0_131. > > 1. On more than half of the machines (200 out of 400 machines), we see he > JMX counter report negative LoadedClassCount, see attached jmxcounter.png. > > After some further dig, we note UnloadedClassCount is larger than > TotalLoadedClassCount. And LoadedClassCount (-695,710) = > TotalLoadedClassCount - UnloadedClassCount . PerfCounter reports the same > number, here is the result on the same machine: > > $ jcmd 307 PerfCounter.print | grep -i class | grep -i java.cls > java.cls.loadedClasses=192004392 > java.cls.sharedLoadedClasses=0 > java.cls.sharedUnloadedClasses=0 > java.cls.unloadedClasses=192700102 > > > > 2. For the same cluster, we also see over half of machines repeatedly > experiencing full GC due to Metaspace full. We dump JSTACK for every minute > during 30 minutes, and see many threads are trying to compile the exact > same lambda form throughout the 30-minute period. > > Here is an example stacktrace on one machine. The LambdaForm triggers the > compilation on that machine is always LambdaForm$MH/170067652. Once it's > compiled, it should use the new compiled lambda form. 
We don't know why > it's still trying to compile the same lambda form again and again. -- Would > it be because the compiled lambda form somehow failed to load? This might > relate to the negative number of loaded classes. > > > "20170926_232912_39740_3vuuu.1.79-4-76640" #76640 prio=5 os_prio=0 > tid=0x00007f908006dbd0 nid=0x150a6 runnable [0x00007f8bddb1b000] > java.lang.Thread.State: RUNNABLE > at sun.misc.Unsafe.defineAnonymousClass(Native Method) > at java.lang.invoke.InvokerByteco > deGenerator.loadAndInitializeInvokerClass(InvokerBytecodeGen > erator.java:284) > at java.lang.invoke.InvokerByteco > deGenerator.loadMethod(InvokerBytecodeGenerator.java:276) > at java.lang.invoke.InvokerByteco > deGenerator.generateCustomizedCode(InvokerBytecodeGenerator.java:618) > at java.lang.invoke.LambdaForm.co > mpileToBytecode(LambdaForm.java:654) > at java.lang.invoke.LambdaForm.prepare(LambdaForm.java:635) > at java.lang.invoke.MethodHandle. > updateForm(MethodHandle.java:1432) > at java.lang.invoke.MethodHandle. > customize(MethodHandle.java:1442) > at java.lang.invoke.Invokers.maybeCustomize(Invokers.java:407) > at java.lang.invoke.Invokers.chec > kCustomized(Invokers.java:398) > at java.lang.invoke.LambdaForm$MH > /170067652.invokeExact_MT(LambdaForm$MH) > at com.facebook.presto.operator.a > ggregation.MinMaxHelper.combineStateWithState(MinMaxHelper.java:141) > at com.facebook.presto.operator.a > ggregation.MaxAggregationFunction.combine(MaxAggregationFunction.java:108) > at java.lang.invoke.LambdaForm$DM > H/1607453282.invokeStatic_L3_V(LambdaForm$DMH) > at java.lang.invoke.LambdaForm$BM > H/1118134445.reinvoke(LambdaForm$BMH) > at java.lang.invoke.LambdaForm$MH > /1971758264.linkToTargetMethod(LambdaForm$MH) > at com.facebook.presto.$gen.Integ > erIntegerMaxGroupedAccumulator_3439.addIntermediate(Unknown Source) > at com.facebook.presto.operator.a > ggregation.builder.InMemoryHashAggregationBuilder$Aggregator > .processPage(InMemoryHashAggregationBuilder.java:367) > at com.facebook.presto.operator.a > ggregation.builder.InMemoryHashAggregationBuilder.processPag > e(InMemoryHashAggregationBuilder.java:138) > at com.facebook.presto.operator.H > ashAggregationOperator.addInput(HashAggregationOperator.java:400) > at com.facebook.presto.operator.D > river.processInternal(Driver.java:343) > at com.facebook.presto.operator.D > river.lambda$processFor$6(Driver.java:241) > at com.facebook.presto.operator.Driver$$Lambda$765/ > 442308692.get(Unknown Source) > at com.facebook.presto.operator.D > river.tryWithLock(Driver.java:614) > at com.facebook.presto.operator.D > river.processFor(Driver.java:235) > at com.facebook.presto.execution. > SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:622) > at com.facebook.presto.execution. > executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163) > at com.facebook.presto.execution. > executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:485) > at java.util.concurrent.ThreadPoo > lExecutor.runWorker(ThreadPoolExecutor.java:1142) > at java.util.concurrent.ThreadPoo > lExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > ... > > > > Both issues go away after we restart the JVM, and the same query won't > trigger the LambdaForm compilation issue, so it looks like the JVM enters > some weird state. We are wondering if there is any thoughts on what could > trigger these issues? Or is there any suggestions about how to further > investigate it next time we see the VM in this state? > > Thank you. 
> > > -- > Best Regards, > Wenlei Xie > > Email: wenlei.xie at gmail.com > -- Best Regards, Wenlei Xie Email: wenlei.xie at gmail.com From alexander.harlap at oracle.com Tue Oct 3 18:44:35 2017 From: alexander.harlap at oracle.com (Alexander Harlap) Date: Tue, 3 Oct 2017 14:44:35 -0400 Subject: Request for review JDK-8187819 gc/TestFullGCALot.java fails on jdk10 started with "-XX:-UseCompressedOops" option Message-ID: Please review change for JDK-8187819 gc/TestFullGCALot.java fails on jdk10 started with "-XX:-UseCompressedOops" option. Change is located at http://cr.openjdk.java.net/~aharlap/8187819/webrev.00/ Initialized metaspace performance counters before their potential use. Tested - JPRT Alex From vladimir.kozlov at oracle.com Tue Oct 3 19:46:35 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 3 Oct 2017 12:46:35 -0700 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> References: <50cda0ab-f403-372a-ce51-1a27d8821448@oracle.com> <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> Message-ID: I rebased it. But there is problem with changes. VM hit guarantee() in this code when run on SPARC in both, fastdebug and product, builds. Crash happens during build. We can't push this - problem should be investigated and fixed first. Thanks, Vladimir make/Main.gmk:443: recipe for target 'generate-link-opt-data' failed /usr/ccs/bin/bash: line 4: 9349 Abort (core dumped) /s/build/solaris-sparcv9-debug/support/interim-image/bin/java -XX:DumpLoadedClassList=/s/build/solaris-sparcv9-debug/support/link_opt/classlist -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -cp /s/build/solaris-sparcv9-debug/support/classlist.jar build.tools.classlist.HelloClasslist 2>&1 > /s/build/solaris-sparcv9-debug/support/link_opt/default_jli_trace.txt make[3]: *** [/s/build/solaris-sparcv9-debug/support/link_opt/classlist] Error 134 make[2]: *** [generate-link-opt-data] Error 1 # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/s/open/src/hotspot/share/memory/heap.cpp:233), pid=9349, tid=2 # guarantee(b == block_at(_next_segment - actual_number_of_segments)) failed: Intermediate allocation! # # JRE version: (10.0) (fastdebug build ) # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 10-internal+0-2017-09-30-014154.8166317, mixed mode, tiered, compressed oops, g1 gc, solaris-sparc) # Core dump will be written. 
Default location: /s/open/make/core or core.9349 # # If you would like to submit a bug report, please visit: # http://bugreport.java.com/bugreport/crash.jsp # --------------- S U M M A R Y ------------ Command Line: -XX:DumpLoadedClassList=/s/build/solaris-sparcv9-debug/support/link_opt/classlist -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true build.tools.classlist.HelloClasslist Host: sca00dbv, Sparcv9 64 bit 3600 MHz, 16 cores, 32G, Oracle Solaris 11.2 SPARC Time: Sat Sep 30 03:29:46 2017 UTC elapsed time: 0 seconds (0d 0h 0m 0s) --------------- T H R E A D --------------- Current thread (0x000000010012f000): JavaThread "Unknown thread" [_thread_in_vm, id=2, stack(0x0007fffef9700000,0x0007fffef9800000)] Stack: [0x0007fffef9700000,0x0007fffef9800000], sp=0x0007fffef97ff020, free space=1020k Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x1f94508] void VMError::report_and_die(int,const char*,const char*,void*,Thread*,unsigned char*,void*,void*,const char*,int,unsigned long)+0xa58 V [libjvm.so+0x1f93a3c] void VMError::report_and_die(Thread*,const char*,int,const char*,const char*,void*)+0x3c V [libjvm.so+0xd02f38] void report_vm_error(const char*,int,const char*,const char*,...)+0x78 V [libjvm.so+0xfc219c] void CodeHeap::deallocate_tail(void*,unsigned long)+0xec V [libjvm.so+0xbf4f14] void CodeCache::free_unused_tail(CodeBlob*,unsigned long)+0xe4 V [libjvm.so+0x1e0ae70] void StubQueue::deallocate_unused_tail()+0x40 V [libjvm.so+0x1e7452c] void TemplateInterpreter::initialize()+0x19c V [libjvm.so+0x1051220] void interpreter_init()+0x20 V [libjvm.so+0x10116e0] int init_globals()+0xf0 V [libjvm.so+0x1ed8548] int Threads::create_vm(JavaVMInitArgs*,bool*)+0x4a8 V [libjvm.so+0x11c7b58] int JNI_CreateJavaVM_inner(JavaVM_**,void**,void*)+0x108 C [libjli.so+0x7950] InitializeJVM+0x100 On 10/2/17 7:55 AM, coleen.phillimore at oracle.com wrote: > > I can sponsor this for you once you rebase, and fix these compilation errors. > Thanks, > Coleen > > On 9/30/17 12:28 AM, Volker Simonis wrote: >> Hi Vladimir, >> >> thanks a lot for remembering these changes! >> >> Regards, >> Volker >> >> >> Vladimir Kozlov > schrieb am Fr. 29. Sep. 2017 um 15:47: >> >> I hit build failure when tried to push changes: >> >> src/hotspot/share/code/codeBlob.hpp(162) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data >> src/hotspot/share/code/codeBlob.hpp(163) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data >> >> I am going to fix it by casting (int): >> >> +? void adjust_size(size_t used) { >> +? ? _size = (int)used; >> +? ? _data_offset = (int)used; >> +? ? _code_end = (address)this + used; >> +? ? _data_end = (address)this + used; >> +? } >> >> Note, CodeCache size can't more than 2Gb (max_int) so such casting is fine. >> >> Vladimir >> >> On 9/6/17 6:20 AM, Volker Simonis wrote: >> > On Tue, Sep 5, 2017 at 9:36 PM,? > wrote: >> >> >> >> I was going to make the same comment about the friend declaration in v1, so >> >> v2 looks better to me.? Looks good.? Thank you for finding a solution to >> >> this problem that we've had for a long time.? I will sponsor this (remind me >> >> if I forget after the 18th). >> >> >> > >> > Thanks Coleen! I've updated >> > >> > http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/ >> > >> > in-place and added you as a second reviewer. 
>> > >> > Regards, >> > Volker >> > >> > >> >> thanks, >> >> Coleen >> >> >> >> >> >> >> >> On 9/5/17 1:17 PM, Vladimir Kozlov wrote: >> >>> >> >>> On 9/5/17 9:49 AM, Volker Simonis wrote: >> >>>> >> >>>> On Fri, Sep 1, 2017 at 6:16 PM, Vladimir Kozlov >> >>>> > wrote: >> >>>>> >> >>>>> May be add new CodeBlob's method to adjust sizes instead of directly >> >>>>> setting >> >>>>> them in? CodeCache::free_unused_tail(). Then you would not need friend >> >>>>> class >> >>>>> CodeCache in CodeBlob. >> >>>>> >> >>>> >> >>>> Changed as suggested (I didn't liked the friend declaration as well :) >> >>>> >> >>>>> Also I think adjustment to header_size should be done in >> >>>>> CodeCache::free_unused_tail() to limit scope of code who knows about >> >>>>> blob >> >>>>> layout. >> >>>>> >> >>>> >> >>>> Yes, that's much cleaner. Please find the updated webrev here: >> >>>> >> >>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/ >> >>> >> >>> >> >>> Good. >> >>> >> >>>> >> >>>> I've also found another "day 1" problem in StubQueue::next(): >> >>>> >> >>>>? ? ? Stub* next(Stub* s) const ? ? ? ? { int i = >> >>>> index_of(s) + stub_size(s); >> >>>> - ? ? ? ? ?if (i == >> >>>> _buffer_limit) i = 0; >> >>>> + ? ? ? ? ?// Only wrap >> >>>> around in the non-contiguous case (see stubss.cpp) >> >>>> + ? ? ? ? ?if (i == >> >>>> _buffer_limit && _queue_end < _buffer_limit) i = 0; >> >>>> ? ? ? ? ? ?return (i == >> >>>> _queue_end) ? NULL : stub_at(i); >> >>>> ? ? ? ? ?} >> >>>> >> >>>> The problem was that the method was not prepared to handle the case >> >>>> where _buffer_limit == _queue_end == _buffer_size which lead to an >> >>>> infinite recursion when iterating over a StubQueue with >> >>>> StubQueue::next() until next() returns NULL (as this was for example >> >>>> done with -XX:+PrintInterpreter). But with the new, trimmed CodeBlob >> >>>> we run into exactly this situation. >> >>> >> >>> >> >>> Okay. >> >>> >> >>>> >> >>>> While doing this last fix I also noticed that "StubQueue::stubs_do()", >> >>>> "StubQueue::queues_do()" and "StubQueue::register_queue()" don't seem >> >>>> to be used anywhere in the open code base (please correct me if I'm >> >>>> wrong). What do you think, maybe we should remove this code in a >> >>>> follow up change if it is really not needed? >> >>> >> >>> >> >>> register_queue() is used in constructor. Other 2 you can remove. >> >>> stub_code_begin() and stub_code_end() are not used too -remove. >> >>> I thought we run on linux with flag which warn about unused code. >> >>> >> >>>> >> >>>> Finally, could you please run the new version through JPRT and sponsor >> >>>> it once jdk10/hs will be opened again? >> >>> >> >>> >> >>> Will do when jdk10 "consolidation" is finished. Please, remind me later if >> >>> I forget. 
>> >>> >> >>> Thanks, >> >>> Vladimir >> >>> >> >>>> >> >>>> Thanks, >> >>>> Volker >> >>>> >> >>>>> Thanks, >> >>>>> Vladimir >> >>>>> >> >>>>> >> >>>>> On 9/1/17 8:46 AM, Volker Simonis wrote: >> >>>>>> >> >>>>>> >> >>>>>> Hi, >> >>>>>> >> >>>>>> I've decided to split the fix for the 'CodeHeap::contains_blob()' >> >>>>>> problem into its own issue "8187091: ReturnBlobToWrongHeapTest fails >> >>>>>> because of problems in CodeHeap::contains_blob()" >> >>>>>> (https://bugs.openjdk.java.net/browse/JDK-8187091) and started a new >> >>>>>> review thread for discussing it at: >> >>>>>> >> >>>>>> >> >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028206.html >> >>>>>> >> >>>>>> So please lets keep this thread for discussing the interpreter code >> >>>>>> size issue only. I've prepared a new version of the webrev which is >> >>>>>> the same as the first one with the only difference that the change to >> >>>>>> 'CodeHeap::contains_blob()' has been removed: >> >>>>>> >> >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v1/ >> >>>>>> >> >>>>>> Thanks, >> >>>>>> Volker >> >>>>>> >> >>>>>> >> >>>>>> On Thu, Aug 31, 2017 at 6:35 PM, Volker Simonis >> >>>>>> > wrote: >> >>>>>>> >> >>>>>>> >> >>>>>>> On Thu, Aug 31, 2017 at 6:05 PM, Vladimir Kozlov >> >>>>>>> > wrote: >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> Very good change. Thank you, Volker. >> >>>>>>>> >> >>>>>>>> About contains_blob(). The problem is that AOTCompiledMethod >> >>>>>>>> allocated >> >>>>>>>> in >> >>>>>>>> CHeap and not in aot code section (which is RO): >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/8acd232fb52a/src/share/vm/aot/aotCompiledMethod.hpp#l124 >> >>>>>>>> >> >>>>>>>> It is allocated in CHeap after AOT library is loaded. Its >> >>>>>>>> code_begin() >> >>>>>>>> points to AOT code section but AOTCompiledMethod* points outside it >> >>>>>>>> (to >> >>>>>>>> normal malloced space) so you can't use (char*)blob address. >> >>>>>>>> >> >>>>>>> >> >>>>>>> Thanks for the explanation - now I got it. >> >>>>>>> >> >>>>>>>> There are 2 ways to fix it, I think. >> >>>>>>>> One is to add new field to CodeBlobLayout and set it to blob* address >> >>>>>>>> for >> >>>>>>>> normal CodeCache blobs and to code_begin for AOT code. >> >>>>>>>> Second is to use contains(blob->code_end() - 1) assuming that AOT >> >>>>>>>> code >> >>>>>>>> is >> >>>>>>>> never zero. >> >>>>>>>> >> >>>>>>> >> >>>>>>> I'll give it a try tomorrow and will send out a new webrev. >> >>>>>>> >> >>>>>>> Regards, >> >>>>>>> Volker >> >>>>>>> >> >>>>>>>> Thanks, >> >>>>>>>> Vladimir >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> On 8/31/17 5:43 AM, Volker Simonis wrote: >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On Thu, Aug 31, 2017 at 12:14 PM, Claes Redestad >> >>>>>>>>> > wrote: >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> On 2017-08-31 08:54, Volker Simonis wrote: >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> While working on this, I found another problem which is related to >> >>>>>>>>>>> the >> >>>>>>>>>>> fix of JDK-8183573 and leads to crashes when executing the JTreg >> >>>>>>>>>>> test >> >>>>>>>>>>> compiler/codecache/stress/ReturnBlobToWrongHeapTest.java. >> >>>>>>>>>>> >> >>>>>>>>>>> The problem is that JDK-8183573 replaced >> >>>>>>>>>>> >> >>>>>>>>>>>? ? ? ? 
virtual bool contains_blob(const CodeBlob* blob) const { >> >>>>>>>>>>> return >> >>>>>>>>>>> low_boundary() <= (char*) blob && (char*) blob < high(); } >> >>>>>>>>>>> >> >>>>>>>>>>> by: >> >>>>>>>>>>> >> >>>>>>>>>>>? ? ? ? bool contains_blob(const CodeBlob* blob) const { return >> >>>>>>>>>>> contains(blob->code_begin()); } >> >>>>>>>>>>> >> >>>>>>>>>>> But that my be wrong in the corner case where the size of the >> >>>>>>>>>>> CodeBlob's payload is zero (i.e. the CodeBlob consists only of the >> >>>>>>>>>>> 'header' - i.e. the C++ object itself) because in that case >> >>>>>>>>>>> CodeBlob::code_begin() points right behind the CodeBlob's header >> >>>>>>>>>>> which >> >>>>>>>>>>> is a memory location which doesn't belong to the CodeBlob anymore. >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> I recall this change was somehow necessary to allow merging >> >>>>>>>>>> AOTCodeHeap::contains_blob and CodeHead::contains_blob into >> >>>>>>>>>> one devirtualized method, so you need to ensure all AOT tests >> >>>>>>>>>> pass with this change (on linux-x64). >> >>>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> All of hotspot/test/aot and hotspot/test/jvmci executed and passed >> >>>>>>>>> successful. Are there any other tests I should check? >> >>>>>>>>> >> >>>>>>>>> That said, it is a little hard to follow the stages of your change. >> >>>>>>>>> It >> >>>>>>>>> seems like >> >>>>>>>>> http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.00/ >> >>>>>>>>> was reviewed [1] but then finally the slightly changed version from >> >>>>>>>>> http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.01/ >> >>>>>>>>> was >> >>>>>>>>> checked in and linked to the bug report. >> >>>>>>>>> >> >>>>>>>>> The first, reviewed version of the change still had a correct >> >>>>>>>>> version >> >>>>>>>>> of 'CodeHeap::contains_blob(const CodeBlob* blob)' while the second, >> >>>>>>>>> checked in version has the faulty version of that method. >> >>>>>>>>> >> >>>>>>>>> I don't know why you finally did that change to 'contains_blob()' >> >>>>>>>>> but >> >>>>>>>>> I don't see any reason why we shouldn't be able to directly use the >> >>>>>>>>> blob's address for inclusion checking. From what I understand, it >> >>>>>>>>> should ALWAYS be contained in the corresponding CodeHeap so no >> >>>>>>>>> reason >> >>>>>>>>> to mess with 'CodeBlob::code_begin()'. >> >>>>>>>>> >> >>>>>>>>> Please let me know if I'm missing something. >> >>>>>>>>> >> >>>>>>>>> [1] >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-July/026624.html >> >>>>>>>>> >> >>>>>>>>>> I can't help to wonder if we'd not be better served by disallowing >> >>>>>>>>>> zero-sized payloads. Is this something that can ever actually >> >>>>>>>>>> happen except by abuse of the white box API? >> >>>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> The corresponding test (ReturnBlobToWrongHeapTest.java) specifically >> >>>>>>>>> wants to allocate "segment sized" blocks which is most easily >> >>>>>>>>> achieved >> >>>>>>>>> by allocation zero-sized CodeBlobs. And I think there's nothing >> >>>>>>>>> wrong >> >>>>>>>>> about it if we handle the inclusion tests correctly. 
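The corner case being discussed is easy to picture with a toy example. The names below are stand-ins, not the real CodeBlob/CodeHeap layout; the only point is the address arithmetic: with an empty payload, code_begin() lands one byte past the blob, which can fall outside the heap (or into a neighbouring one) even though the blob itself is clearly inside it.

    // Toy illustration of the zero-payload corner case; not HotSpot code.
    struct ToyBlob {
      int   _header_size;     // size of the C++ object itself
      int   _payload_size;    // 0 for a "segment sized" blob
      char* code_begin() { return (char*)this + _header_size; }
    };

    // Half-open range check, as heap inclusion tests usually are.
    static bool range_contains(char* low, char* high, const void* p) {
      return low <= (const char*)p && (const char*)p < high;
    }

    // If a ToyBlob with _payload_size == 0 ends exactly at 'high', then:
    //   range_contains(low, high, blob)               -> true
    //   range_contains(low, high, blob->code_begin()) -> false
    // so an inclusion test based on code_begin() misreports exactly the
    // blobs that ReturnBlobToWrongHeapTest allocates.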
>> >>>>>>>>> >> >>>>>>>>> Thank you and best regards, >> >>>>>>>>> Volker >> >>>>>>>>> >> >>>>>>>>>> /Claes >> >> >> >> >> > From vladimir.kozlov at oracle.com Tue Oct 3 19:58:54 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 3 Oct 2017 12:58:54 -0700 Subject: JDK10/RFR(M): 8172232: SPARC ISA/CPU feature detection is broken/insufficient (on Linux). In-Reply-To: <908a6ae1-0d83-361c-9c1b-1b2a114884ff@oracle.com> References: <7d5e1ebb-7de8-66f1-a1f0-db465bcad4ab@oracle.com> <9f2896ca-65dc-557f-793c-4235499cc340@oracle.com> <908a6ae1-0d83-361c-9c1b-1b2a114884ff@oracle.com> Message-ID: On 10/2/17 8:52 AM, Patric Hedlin wrote: > Hi Vladimir, > > > On 09/29/2017 08:56 PM, Vladimir Kozlov wrote: >> In general it is fine. Few notes. >> You use ifdef DEBUG_SPARC_CAPS which is undefed at the beginning. Is it set by gcc by default? >> > I have not noticed any (obvious) convention in the code base for this case, when I have a entirely (file-) local, typically debug, definition that makes no sense to define except within a particular > file. I usually list those as undefines in the beginning of the file to make sure they are not exposed to the command line (the rationale being that they should not be of use if you are not actively > making changes to the particular file). And it sort of works as part of the local docs. Got it. But in such situation we have other mechanisms to print information about CPUs. I would suggest to use unified logging currently we use for this: -Xlog:os+cpu http://hg.openjdk.java.net/jdk10/hs/file/58931d9b2260/src/hotspot/share/runtime/vm_version.cpp#l300 There are different levels of output and for your case you can use Debug or Trace level (default is Info). Thanks, Vladimir > > Would it be an acceptable approach to add a comment like this: > > /* NOTE: Enable the local define 'DEBUG_LINUX_SPARC_CAPS' below (or define it > ?*?????? from the command line) as an aid when updating the feature table. > #define DEBUG_LINUX_SPARC_CAPS > ?*/ > > Close to its first use (?). (I changed the name since it will be exposed outside the file.) > >> Coding style for methods definitions - open parenthesis should be on the same line: >> >> +? bool match(const char* s) const >> +? { >> > > Old habits die hard... and it's so much more readable ;) > > /Patric >> Thanks, >> Vladimir >> >> On 9/29/17 6:08 AM, Patric Hedlin wrote: >>> Dear all, >>> >>> I would like to ask for help to review the following change/update: >>> >>> Issue:? https://bugs.openjdk.java.net/browse/JDK-8172232 >>> >>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8172232/ >>> >>> >>> 8172232: SPARC ISA/CPU feature detection is broken/insufficient (on Linux). >>> >>> ???? Subsumes (duplicate) JDK-8186579: VM_Version::platform_features() needs update on linux-sparc. >>> >>> >>> Caveat: >>> >>> ???? This update will introduce some redundancies into the code base, features and definitions >>> ???? currently not used, addressed by subsequent bug or feature updates/patches. Fujitsu HW is >>> ???? treated very conservatively. >>> >>> >>> Testing: >>> >>> ???? JDK9/JDK10 local jtreg/hotspot >>> >>> >>> Thanks to Adrian for additional test (and review) support. 
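Making the unified-logging suggestion above concrete: instead of a file-local #define guarding ad-hoc printing, a capability dump can sit behind the os+cpu tags at info/debug level, so it is available in any build via -Xlog:os+cpu=debug and costs nothing when the tags are off. The sketch below only shows the shape of such a helper; the function name, parameters and message text are made up, and it needs the HotSpot logging header to build.

    #include "logging/log.hpp"

    // Hypothetical replacement for DEBUG_SPARC_CAPS-style printing:
    // always compiled in, only emitted when enabled with e.g.
    //   java -Xlog:os+cpu=debug ...
    static void print_sparc_caps_sketch(uint64_t av1, uint64_t av2) {
      log_info(os, cpu)("SPARC capabilities: av1=0x%016llx av2=0x%016llx",
                        (unsigned long long)av1, (unsigned long long)av2);
      if (log_is_enabled(Debug, os, cpu)) {
        // Per-bit decoding of the feature table would go here.
        log_debug(os, cpu)("decoded feature list ...");
      }
    }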
>>> >>> Tested-By: John Paul Adrian Glaubitz >>> >>> >>> Best regards, >>> Patric >>> > From coleen.phillimore at oracle.com Tue Oct 3 20:02:38 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Oct 2017 16:02:38 -0400 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: <7982f8eb-e4ba-8c09-f15f-e33797553141@oracle.com> References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> <1498efad-e443-5875-cc20-b0d0c926e883@oracle.com> <7982f8eb-e4ba-8c09-f15f-e33797553141@oracle.com> Message-ID: <124f386e-28ec-701a-111c-fcc15335feb6@oracle.com> Stefan found a problem that set_java_mirror() code could be unsafe if the java_mirror code changes, which the function allowed one to do.? There is code in jvmtiRedefineClasses that temporarily switches the java_mirrors for verification of the newly loaded class.? Since this simply swaps java_mirrors that are together in the ClassLoaderData::_handles area, I added an API for that and made set_java_mirror() more restrictive. I reran JVMTI, CDS and tier1 tests.?? New webrev with all changes are: open webrev at http://cr.openjdk.java.net/~coleenp/8186777.04/webrev Thanks, Coleen On 10/3/17 10:23 AM, coleen.phillimore at oracle.com wrote: > > Here is an updated webrev with fixes for your comments. > > open webrev at http://cr.openjdk.java.net/~coleenp/8186777.03/webrev > > Thanks for reviewing and all your help with this! > > Coleen > > On 9/29/17 6:41 AM, Stefan Karlsson wrote: >> Hi Coleen, >> >> I started looking at this, but will need a second round before I've >> fully reviewed the GC parts. >> >> Here are some nits that would be nice to get cleaned up. >> >> ========== >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/classfile/classLoaderData.cpp.frames.html >> >> >> ?788???? record_modified_oops();? // necessary? >> >> This could be removed. Only G1 cares about deleted "weak" references. >> >> Or we can wait until Erik?'s GC Barrier Interface is in place and >> remove it then. >> >> ---------- >> >> ?#ifdef CLD_DUMP_KLASSES >> ?? if (Verbose) { >> ???? Klass* k = _klasses; >> ???? while (k != NULL) { >> -????? out->print_cr("klass " PTR_FORMAT ", %s, CT: %d, MUT: %d", k, >> k->name()->as_C_string(), >> -????????? k->has_modified_oops(), k->has_accumulated_modified_oops()); >> +????? out->print_cr("klass " PTR_FORMAT ", %s", k, >> k->name()->as_C_string()); >> ?????? assert(k != k->next_link(), "no loops!"); >> ?????? k = k->next_link(); >> ???? } >> ?? } >> ?#endif? // CLD_DUMP_KLASSES >> >> Pre-existing: I don't think this will compile if you turn on >> CLD_DUMP_KLASSES. k must be p2i(k). >> >> ========== >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/classfile/classLoaderData.hpp.udiff.html >> >> >> +? // Remembered sets support for the oops in the class loader data. >> +? jbyte _modified_oops;???????????? // Card Table Equivalent (YC/CMS >> support) >> +? jbyte _accumulated_modified_oops; // Mod Union Equivalent (CMS >> support) >> >> We should create a follow-up bug to change these jbytes to bools. >> >> ========== >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/g1/g1HeapVerifier.cpp.frames.html >> >> >> Spurious addition: >> +? G1CollectedHeap* _g1h; >> >> ========== >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/g1/g1OopClosures.hpp.udiff.html >> >> >> Spurious addition?: >> +? 
G1CollectedHeap* g1() { return _g1; } >> >> ========== >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/parallel/psScavenge.inline.hpp.patch >> >> >> ?? PSPromotionManager* _pm; >> -? // Used to redirty a scanned klass if it has oops >> +? // Used to redirty a scanned cld if it has oops >> ?? // pointing to the young generation after being scanned. >> -? Klass*???????????? _scanned_klass; >> +? ClassLoaderData*???????????? _scanned_cld; >> >> Indentation. >> >> ========== >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/parallel/psTasks.cpp.frames.html >> >> >> ? 80???? case class_loader_data: >> ? 81???? { >> ? 82?????? PSScavengeCLDClosure ps(pm); >> ? 83?????? ClassLoaderDataGraph::cld_do(&ps); >> ? 84???? } >> >> Would you mind changing the name ps to cld_closure? >> >> ========== >> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/shared/genOopClosures.hpp.patch >> >> >> +? OopsInClassLoaderDataOrGenClosure*?? _scavenge_closure; >> ?? // true if the the modified oops state should be saved. >> ?? bool???????????????????? _accumulate_modified_oops; >> >> Indentation. >> >> ---------- >> +? void do_cld(ClassLoaderData* k); >> >> Rename k? >> >> Thanks, >> StefanK >> >> On 2017-09-28 23:36, coleen.phillimore at oracle.com wrote: >>> >>> Thank you to Stefan Karlsson offlist for pointing out that the >>> previous .01 version of this webrev breaks CMS in that it doesn't >>> remember ClassLoaderData::_handles that are changed and added while >>> concurrent marking is in progress.? I've fixed this bug to move the >>> Klass::_modified_oops and _accumulated_modified_oops to the >>> ClassLoaderData and use these fields in the CMS remarking phase to >>> catch any new handles that are added.?? This also fixes this bug >>> https://bugs.openjdk.java.net/browse/JDK-8173988 . >>> >>> In addition, the previous version of this change removed an >>> optimization during young collection, which showed some uncertain >>> performance regression in young pause times, so I added this >>> optimization back to not walk ClassLoaderData during young >>> collections if all the oops are old.? The performance results of >>> SPECjbb2015 now are slightly better, but not significantly. >>> >>> This latest patch has been tested on tier1-5 on linux x64 and >>> windows x64 in mach5 test harness. >>> >>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ >>> >>> Can I get at least 3 reviewers?? One from each of the compiler, gc, >>> and runtime group at least since there are changes to all 3. >>> >>> Thanks! >>> Coleen >>> >>> >>> On 9/6/17 12:04 PM, coleen.phillimore at oracle.com wrote: >>>> Summary: Add indirection for fetching mirror so that GC doesn't >>>> have to follow CLD::_klasses >>>> >>>> Thank you to Tom Rodriguez for Graal changes and Rickard for the C2 >>>> changes. >>>> >>>> Ran nightly tests through Mach5 and RBT.?? Early performance >>>> testing showed good performance improvment in GC class loader data >>>> processing time, but nmethod processing time continues to dominate. >>>> Also performace testing showed no throughput regression.?? I'm >>>> rerunning both of these performance testing and will post the numbers. 
>>>> >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8186777 >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.01/webrev >>>> >>>> Thanks, >>>> Coleen > From stefan.karlsson at oracle.com Tue Oct 3 20:15:06 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 3 Oct 2017 22:15:06 +0200 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: <124f386e-28ec-701a-111c-fcc15335feb6@oracle.com> References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> <1498efad-e443-5875-cc20-b0d0c926e883@oracle.com> <7982f8eb-e4ba-8c09-f15f-e33797553141@oracle.com> <124f386e-28ec-701a-111c-fcc15335feb6@oracle.com> Message-ID: <055f4b75-efaa-79a3-0b6f-83c13ab87896@oracle.com> On 2017-10-03 22:02, coleen.phillimore at oracle.com wrote: > > Stefan found a problem that set_java_mirror() code could be unsafe if > the java_mirror code changes, which the function allowed one to do.? > There is code in jvmtiRedefineClasses that temporarily switches the > java_mirrors for verification of the newly loaded class.? Since this > simply swaps java_mirrors that are together in the > ClassLoaderData::_handles area, I added an API for that and made > set_java_mirror() more restrictive. > > I reran JVMTI, CDS and tier1 tests.?? New webrev with all changes are: > > open webrev at http://cr.openjdk.java.net/~coleenp/8186777.04/webrev The GC parts look good to me. Thanks, StefanK > > Thanks, > Coleen > > On 10/3/17 10:23 AM, coleen.phillimore at oracle.com wrote: >> >> Here is an updated webrev with fixes for your comments. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.03/webrev >> >> Thanks for reviewing and all your help with this! >> >> Coleen >> >> On 9/29/17 6:41 AM, Stefan Karlsson wrote: >>> Hi Coleen, >>> >>> I started looking at this, but will need a second round before I've >>> fully reviewed the GC parts. >>> >>> Here are some nits that would be nice to get cleaned up. >>> >>> ========== >>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/classfile/classLoaderData.cpp.frames.html >>> >>> >>> ?788???? record_modified_oops();? // necessary? >>> >>> This could be removed. Only G1 cares about deleted "weak" references. >>> >>> Or we can wait until Erik?'s GC Barrier Interface is in place and >>> remove it then. >>> >>> ---------- >>> >>> ?#ifdef CLD_DUMP_KLASSES >>> ?? if (Verbose) { >>> ???? Klass* k = _klasses; >>> ???? while (k != NULL) { >>> -????? out->print_cr("klass " PTR_FORMAT ", %s, CT: %d, MUT: %d", k, >>> k->name()->as_C_string(), >>> -????????? k->has_modified_oops(), k->has_accumulated_modified_oops()); >>> +????? out->print_cr("klass " PTR_FORMAT ", %s", k, >>> k->name()->as_C_string()); >>> ?????? assert(k != k->next_link(), "no loops!"); >>> ?????? k = k->next_link(); >>> ???? } >>> ?? } >>> ?#endif? // CLD_DUMP_KLASSES >>> >>> Pre-existing: I don't think this will compile if you turn on >>> CLD_DUMP_KLASSES. k must be p2i(k). >>> >>> ========== >>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/classfile/classLoaderData.hpp.udiff.html >>> >>> >>> +? // Remembered sets support for the oops in the class loader data. >>> +? jbyte _modified_oops;???????????? // Card Table Equivalent >>> (YC/CMS support) >>> +? jbyte _accumulated_modified_oops; // Mod Union Equivalent (CMS >>> support) >>> >>> We should create a follow-up bug to change these jbytes to bools. 
>>> >>> ========== >>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/g1/g1HeapVerifier.cpp.frames.html >>> >>> >>> Spurious addition: >>> +? G1CollectedHeap* _g1h; >>> >>> ========== >>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/g1/g1OopClosures.hpp.udiff.html >>> >>> >>> Spurious addition?: >>> +? G1CollectedHeap* g1() { return _g1; } >>> >>> ========== >>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/parallel/psScavenge.inline.hpp.patch >>> >>> >>> ?? PSPromotionManager* _pm; >>> -? // Used to redirty a scanned klass if it has oops >>> +? // Used to redirty a scanned cld if it has oops >>> ?? // pointing to the young generation after being scanned. >>> -? Klass*???????????? _scanned_klass; >>> +? ClassLoaderData*???????????? _scanned_cld; >>> >>> Indentation. >>> >>> ========== >>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/parallel/psTasks.cpp.frames.html >>> >>> >>> ? 80???? case class_loader_data: >>> ? 81???? { >>> ? 82?????? PSScavengeCLDClosure ps(pm); >>> ? 83?????? ClassLoaderDataGraph::cld_do(&ps); >>> ? 84???? } >>> >>> Would you mind changing the name ps to cld_closure? >>> >>> ========== >>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/shared/genOopClosures.hpp.patch >>> >>> >>> +? OopsInClassLoaderDataOrGenClosure*?? _scavenge_closure; >>> ?? // true if the the modified oops state should be saved. >>> ?? bool???????????????????? _accumulate_modified_oops; >>> >>> Indentation. >>> >>> ---------- >>> +? void do_cld(ClassLoaderData* k); >>> >>> Rename k? >>> >>> Thanks, >>> StefanK >>> >>> On 2017-09-28 23:36, coleen.phillimore at oracle.com wrote: >>>> >>>> Thank you to Stefan Karlsson offlist for pointing out that the >>>> previous .01 version of this webrev breaks CMS in that it doesn't >>>> remember ClassLoaderData::_handles that are changed and added while >>>> concurrent marking is in progress. I've fixed this bug to move the >>>> Klass::_modified_oops and _accumulated_modified_oops to the >>>> ClassLoaderData and use these fields in the CMS remarking phase to >>>> catch any new handles that are added.?? This also fixes this bug >>>> https://bugs.openjdk.java.net/browse/JDK-8173988 . >>>> >>>> In addition, the previous version of this change removed an >>>> optimization during young collection, which showed some uncertain >>>> performance regression in young pause times, so I added this >>>> optimization back to not walk ClassLoaderData during young >>>> collections if all the oops are old.? The performance results of >>>> SPECjbb2015 now are slightly better, but not significantly. >>>> >>>> This latest patch has been tested on tier1-5 on linux x64 and >>>> windows x64 in mach5 test harness. >>>> >>>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ >>>> >>>> Can I get at least 3 reviewers?? One from each of the compiler, gc, >>>> and runtime group at least since there are changes to all 3. >>>> >>>> Thanks! >>>> Coleen >>>> >>>> >>>> On 9/6/17 12:04 PM, coleen.phillimore at oracle.com wrote: >>>>> Summary: Add indirection for fetching mirror so that GC doesn't >>>>> have to follow CLD::_klasses >>>>> >>>>> Thank you to Tom Rodriguez for Graal changes and Rickard for the >>>>> C2 changes. >>>>> >>>>> Ran nightly tests through Mach5 and RBT.?? Early performance >>>>> testing showed good performance improvment in GC class loader data >>>>> processing time, but nmethod processing time continues to >>>>> dominate. 
Also performace testing showed no throughput >>>>> regression.?? I'm rerunning both of these performance testing and >>>>> will post the numbers. >>>>> >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8186777 >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.01/webrev >>>>> >>>>> Thanks, >>>>> Coleen >> > From coleen.phillimore at oracle.com Tue Oct 3 20:31:43 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Oct 2017 16:31:43 -0400 Subject: RFR (L) 8186777: Make Klass::_java_mirror an OopHandle In-Reply-To: <055f4b75-efaa-79a3-0b6f-83c13ab87896@oracle.com> References: <9adb92ce-fb2d-11df-01dc-722e482a4d40@oracle.com> <383dcc42-47ea-3d3e-5565-15f8950c35ae@oracle.com> <1498efad-e443-5875-cc20-b0d0c926e883@oracle.com> <7982f8eb-e4ba-8c09-f15f-e33797553141@oracle.com> <124f386e-28ec-701a-111c-fcc15335feb6@oracle.com> <055f4b75-efaa-79a3-0b6f-83c13ab87896@oracle.com> Message-ID: On 10/3/17 4:15 PM, Stefan Karlsson wrote: > On 2017-10-03 22:02, coleen.phillimore at oracle.com wrote: >> >> Stefan found a problem that set_java_mirror() code could be unsafe if >> the java_mirror code changes, which the function allowed one to do.? >> There is code in jvmtiRedefineClasses that temporarily switches the >> java_mirrors for verification of the newly loaded class.? Since this >> simply swaps java_mirrors that are together in the >> ClassLoaderData::_handles area, I added an API for that and made >> set_java_mirror() more restrictive. >> >> I reran JVMTI, CDS and tier1 tests.?? New webrev with all changes are: >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.04/webrev > > The GC parts look good to me. Thanks for your help! Coleen > > Thanks, > StefanK > >> >> Thanks, >> Coleen >> >> On 10/3/17 10:23 AM, coleen.phillimore at oracle.com wrote: >>> >>> Here is an updated webrev with fixes for your comments. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.03/webrev >>> >>> Thanks for reviewing and all your help with this! >>> >>> Coleen >>> >>> On 9/29/17 6:41 AM, Stefan Karlsson wrote: >>>> Hi Coleen, >>>> >>>> I started looking at this, but will need a second round before I've >>>> fully reviewed the GC parts. >>>> >>>> Here are some nits that would be nice to get cleaned up. >>>> >>>> ========== >>>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/classfile/classLoaderData.cpp.frames.html >>>> >>>> >>>> ?788???? record_modified_oops();? // necessary? >>>> >>>> This could be removed. Only G1 cares about deleted "weak" references. >>>> >>>> Or we can wait until Erik?'s GC Barrier Interface is in place and >>>> remove it then. >>>> >>>> ---------- >>>> >>>> ?#ifdef CLD_DUMP_KLASSES >>>> ?? if (Verbose) { >>>> ???? Klass* k = _klasses; >>>> ???? while (k != NULL) { >>>> -????? out->print_cr("klass " PTR_FORMAT ", %s, CT: %d, MUT: %d", >>>> k, k->name()->as_C_string(), >>>> -????????? k->has_modified_oops(), >>>> k->has_accumulated_modified_oops()); >>>> +????? out->print_cr("klass " PTR_FORMAT ", %s", k, >>>> k->name()->as_C_string()); >>>> ?????? assert(k != k->next_link(), "no loops!"); >>>> ?????? k = k->next_link(); >>>> ???? } >>>> ?? } >>>> ?#endif? // CLD_DUMP_KLASSES >>>> >>>> Pre-existing: I don't think this will compile if you turn on >>>> CLD_DUMP_KLASSES. k must be p2i(k). >>>> >>>> ========== >>>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/classfile/classLoaderData.hpp.udiff.html >>>> >>>> >>>> +? 
// Remembered sets support for the oops in the class loader data. >>>> +? jbyte _modified_oops;???????????? // Card Table Equivalent >>>> (YC/CMS support) >>>> +? jbyte _accumulated_modified_oops; // Mod Union Equivalent (CMS >>>> support) >>>> >>>> We should create a follow-up bug to change these jbytes to bools. >>>> >>>> ========== >>>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/g1/g1HeapVerifier.cpp.frames.html >>>> >>>> >>>> Spurious addition: >>>> +? G1CollectedHeap* _g1h; >>>> >>>> ========== >>>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/g1/g1OopClosures.hpp.udiff.html >>>> >>>> >>>> Spurious addition?: >>>> +? G1CollectedHeap* g1() { return _g1; } >>>> >>>> ========== >>>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/parallel/psScavenge.inline.hpp.patch >>>> >>>> >>>> ?? PSPromotionManager* _pm; >>>> -? // Used to redirty a scanned klass if it has oops >>>> +? // Used to redirty a scanned cld if it has oops >>>> ?? // pointing to the young generation after being scanned. >>>> -? Klass*???????????? _scanned_klass; >>>> +? ClassLoaderData*???????????? _scanned_cld; >>>> >>>> Indentation. >>>> >>>> ========== >>>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/parallel/psTasks.cpp.frames.html >>>> >>>> >>>> ? 80???? case class_loader_data: >>>> ? 81???? { >>>> ? 82?????? PSScavengeCLDClosure ps(pm); >>>> ? 83?????? ClassLoaderDataGraph::cld_do(&ps); >>>> ? 84???? } >>>> >>>> Would you mind changing the name ps to cld_closure? >>>> >>>> ========== >>>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/src/hotspot/share/gc/shared/genOopClosures.hpp.patch >>>> >>>> >>>> +? OopsInClassLoaderDataOrGenClosure*?? _scavenge_closure; >>>> ?? // true if the the modified oops state should be saved. >>>> ?? bool???????????????????? _accumulate_modified_oops; >>>> >>>> Indentation. >>>> >>>> ---------- >>>> +? void do_cld(ClassLoaderData* k); >>>> >>>> Rename k? >>>> >>>> Thanks, >>>> StefanK >>>> >>>> On 2017-09-28 23:36, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Thank you to Stefan Karlsson offlist for pointing out that the >>>>> previous .01 version of this webrev breaks CMS in that it doesn't >>>>> remember ClassLoaderData::_handles that are changed and added >>>>> while concurrent marking is in progress. I've fixed this bug to >>>>> move the Klass::_modified_oops and _accumulated_modified_oops to >>>>> the ClassLoaderData and use these fields in the CMS remarking >>>>> phase to catch any new handles that are added.?? This also fixes >>>>> this bug https://bugs.openjdk.java.net/browse/JDK-8173988 . >>>>> >>>>> In addition, the previous version of this change removed an >>>>> optimization during young collection, which showed some uncertain >>>>> performance regression in young pause times, so I added this >>>>> optimization back to not walk ClassLoaderData during young >>>>> collections if all the oops are old.? The performance results of >>>>> SPECjbb2015 now are slightly better, but not significantly. >>>>> >>>>> This latest patch has been tested on tier1-5 on linux x64 and >>>>> windows x64 in mach5 test harness. >>>>> >>>>> http://cr.openjdk.java.net/~coleenp/8186777.02/webrev/ >>>>> >>>>> Can I get at least 3 reviewers?? One from each of the compiler, >>>>> gc, and runtime group at least since there are changes to all 3. >>>>> >>>>> Thanks! 
>>>>> Coleen >>>>> >>>>> >>>>> On 9/6/17 12:04 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Add indirection for fetching mirror so that GC doesn't >>>>>> have to follow CLD::_klasses >>>>>> >>>>>> Thank you to Tom Rodriguez for Graal changes and Rickard for the >>>>>> C2 changes. >>>>>> >>>>>> Ran nightly tests through Mach5 and RBT.?? Early performance >>>>>> testing showed good performance improvment in GC class loader >>>>>> data processing time, but nmethod processing time continues to >>>>>> dominate. Also performace testing showed no throughput >>>>>> regression.?? I'm rerunning both of these performance testing and >>>>>> will post the numbers. >>>>>> >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8186777 >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8186777.01/webrev >>>>>> >>>>>> Thanks, >>>>>> Coleen >>> >> > From volker.simonis at gmail.com Wed Oct 4 07:19:49 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 04 Oct 2017 07:19:49 +0000 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: References: <50cda0ab-f403-372a-ce51-1a27d8821448@oracle.com> <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> Message-ID: Thanks Vladimir, I'll take a look at the problem next week when I'm back from JavaOne. Regards, Volker Vladimir Kozlov schrieb am Di. 3. Okt. 2017 um 12:43: > I rebased it. But there is problem with changes. VM hit guarantee() in > this code when run on SPARC in both, fastdebug and product, builds. > Crash happens during build. We can't push this - problem should be > investigated and fixed first. > > Thanks, > Vladimir > > make/Main.gmk:443: recipe for target 'generate-link-opt-data' failed > /usr/ccs/bin/bash: line 4: 9349 Abort (core dumped) > /s/build/solaris-sparcv9-debug/support/interim-image/bin/java > -XX:DumpLoadedClassList=/s/build/solaris-sparcv9-debug/support/link_opt/classlist > -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -cp > /s/build/solaris-sparcv9-debug/support/classlist.jar > build.tools.classlist.HelloClasslist 2>&1 > > /s/build/solaris-sparcv9-debug/support/link_opt/default_jli_trace.txt > make[3]: *** [/s/build/solaris-sparcv9-debug/support/link_opt/classlist] > Error 134 > make[2]: *** [generate-link-opt-data] Error 1 > > > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/s/open/src/hotspot/share/memory/heap.cpp:233), > pid=9349, tid=2 > # guarantee(b == block_at(_next_segment - actual_number_of_segments)) > failed: Intermediate allocation! > # > # JRE version: (10.0) (fastdebug build ) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug > 10-internal+0-2017-09-30-014154.8166317, mixed mode, tiered, compressed > oops, g1 gc, solaris-sparc) > # Core dump will be written. 
Default location: /s/open/make/core or > core.9349 > # > # If you would like to submit a bug report, please visit: > # http://bugreport.java.com/bugreport/crash.jsp > # > > --------------- S U M M A R Y ------------ > > Command Line: > -XX:DumpLoadedClassList=/s/build/solaris-sparcv9-debug/support/link_opt/classlist > -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true > build.tools.classlist.HelloClasslist > > Host: sca00dbv, Sparcv9 64 bit 3600 MHz, 16 cores, 32G, Oracle Solaris > 11.2 SPARC > Time: Sat Sep 30 03:29:46 2017 UTC elapsed time: 0 seconds (0d 0h 0m 0s) > > --------------- T H R E A D --------------- > > Current thread (0x000000010012f000): JavaThread "Unknown thread" > [_thread_in_vm, id=2, stack(0x0007fffef9700000,0x0007fffef9800000)] > > Stack: [0x0007fffef9700000,0x0007fffef9800000], sp=0x0007fffef97ff020, > free space=1020k > Native frames: (J=compiled Java code, A=aot compiled Java code, > j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x1f94508] void VMError::report_and_die(int,const > char*,const char*,void*,Thread*,unsigned char*,void*,void*,const > char*,int,unsigned long)+0xa58 > V [libjvm.so+0x1f93a3c] void VMError::report_and_die(Thread*,const > char*,int,const char*,const char*,void*)+0x3c > V [libjvm.so+0xd02f38] void report_vm_error(const char*,int,const > char*,const char*,...)+0x78 > V [libjvm.so+0xfc219c] void CodeHeap::deallocate_tail(void*,unsigned > long)+0xec > V [libjvm.so+0xbf4f14] void > CodeCache::free_unused_tail(CodeBlob*,unsigned long)+0xe4 > V [libjvm.so+0x1e0ae70] void StubQueue::deallocate_unused_tail()+0x40 > V [libjvm.so+0x1e7452c] void TemplateInterpreter::initialize()+0x19c > V [libjvm.so+0x1051220] void interpreter_init()+0x20 > V [libjvm.so+0x10116e0] int init_globals()+0xf0 > V [libjvm.so+0x1ed8548] int > Threads::create_vm(JavaVMInitArgs*,bool*)+0x4a8 > V [libjvm.so+0x11c7b58] int > JNI_CreateJavaVM_inner(JavaVM_**,void**,void*)+0x108 > C [libjli.so+0x7950] InitializeJVM+0x100 > > > On 10/2/17 7:55 AM, coleen.phillimore at oracle.com wrote: > > > > I can sponsor this for you once you rebase, and fix these compilation > errors. > > Thanks, > > Coleen > > > > On 9/30/17 12:28 AM, Volker Simonis wrote: > >> Hi Vladimir, > >> > >> thanks a lot for remembering these changes! > >> > >> Regards, > >> Volker > >> > >> > >> Vladimir Kozlov vladimir.kozlov at oracle.com>> schrieb am Fr. 29. Sep. 2017 um 15:47: > >> > >> I hit build failure when tried to push changes: > >> > >> src/hotspot/share/code/codeBlob.hpp(162) : warning C4267: '=' : > conversion from 'size_t' to 'int', possible loss of data > >> src/hotspot/share/code/codeBlob.hpp(163) : warning C4267: '=' : > conversion from 'size_t' to 'int', possible loss of data > >> > >> I am going to fix it by casting (int): > >> > >> + void adjust_size(size_t used) { > >> + _size = (int)used; > >> + _data_offset = (int)used; > >> + _code_end = (address)this + used; > >> + _data_end = (address)this + used; > >> + } > >> > >> Note, CodeCache size can't more than 2Gb (max_int) so such casting > is fine. > >> > >> Vladimir > >> > >> On 9/6/17 6:20 AM, Volker Simonis wrote: > >> > On Tue, Sep 5, 2017 at 9:36 PM, > wrote: > >> >> > >> >> I was going to make the same comment about the friend > declaration in v1, so > >> >> v2 looks better to me. Looks good. Thank you for finding a > solution to > >> >> this problem that we've had for a long time. I will sponsor > this (remind me > >> >> if I forget after the 18th). > >> >> > >> > > >> > Thanks Coleen! 
I've updated > >> > > >> > http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/ < > http://cr.openjdk.java.net/%7Esimonis/webrevs/2017/8166317.v2/> > >> > > >> > in-place and added you as a second reviewer. > >> > > >> > Regards, > >> > Volker > >> > > >> > > >> >> thanks, > >> >> Coleen > >> >> > >> >> > >> >> > >> >> On 9/5/17 1:17 PM, Vladimir Kozlov wrote: > >> >>> > >> >>> On 9/5/17 9:49 AM, Volker Simonis wrote: > >> >>>> > >> >>>> On Fri, Sep 1, 2017 at 6:16 PM, Vladimir Kozlov > >> >>>> > > wrote: > >> >>>>> > >> >>>>> May be add new CodeBlob's method to adjust sizes instead of > directly > >> >>>>> setting > >> >>>>> them in CodeCache::free_unused_tail(). Then you would not > need friend > >> >>>>> class > >> >>>>> CodeCache in CodeBlob. > >> >>>>> > >> >>>> > >> >>>> Changed as suggested (I didn't liked the friend declaration as > well :) > >> >>>> > >> >>>>> Also I think adjustment to header_size should be done in > >> >>>>> CodeCache::free_unused_tail() to limit scope of code who > knows about > >> >>>>> blob > >> >>>>> layout. > >> >>>>> > >> >>>> > >> >>>> Yes, that's much cleaner. Please find the updated webrev here: > >> >>>> > >> >>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/ < > http://cr.openjdk.java.net/%7Esimonis/webrevs/2017/8166317.v2/> > >> >>> > >> >>> > >> >>> Good. > >> >>> > >> >>>> > >> >>>> I've also found another "day 1" problem in StubQueue::next(): > >> >>>> > >> >>>> Stub* next(Stub* s) const { int i = > >> >>>> index_of(s) + stub_size(s); > >> >>>> - if (i == > >> >>>> _buffer_limit) i = 0; > >> >>>> + // Only wrap > >> >>>> around in the non-contiguous case (see stubss.cpp) > >> >>>> + if (i == > >> >>>> _buffer_limit && _queue_end < _buffer_limit) i = 0; > >> >>>> return (i == > >> >>>> _queue_end) ? NULL : stub_at(i); > >> >>>> } > >> >>>> > >> >>>> The problem was that the method was not prepared to handle the > case > >> >>>> where _buffer_limit == _queue_end == _buffer_size which lead > to an > >> >>>> infinite recursion when iterating over a StubQueue with > >> >>>> StubQueue::next() until next() returns NULL (as this was for > example > >> >>>> done with -XX:+PrintInterpreter). But with the new, trimmed > CodeBlob > >> >>>> we run into exactly this situation. > >> >>> > >> >>> > >> >>> Okay. > >> >>> > >> >>>> > >> >>>> While doing this last fix I also noticed that > "StubQueue::stubs_do()", > >> >>>> "StubQueue::queues_do()" and "StubQueue::register_queue()" > don't seem > >> >>>> to be used anywhere in the open code base (please correct me > if I'm > >> >>>> wrong). What do you think, maybe we should remove this code in > a > >> >>>> follow up change if it is really not needed? > >> >>> > >> >>> > >> >>> register_queue() is used in constructor. Other 2 you can remove. > >> >>> stub_code_begin() and stub_code_end() are not used too -remove. > >> >>> I thought we run on linux with flag which warn about unused > code. > >> >>> > >> >>>> > >> >>>> Finally, could you please run the new version through JPRT and > sponsor > >> >>>> it once jdk10/hs will be opened again? > >> >>> > >> >>> > >> >>> Will do when jdk10 "consolidation" is finished. Please, remind > me later if > >> >>> I forget. 
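As a side note for anyone following along, here is a standalone toy model of the StubQueue::next() corner case described above. It is a simplified sketch, not the real StubQueue (the field names just mirror the discussion), but it shows why the extra _queue_end < _buffer_limit test is needed once a trimmed, completely full queue can have _buffer_limit == _queue_end == _buffer_size:

#include <cstdio>

// Simplified stand-in for StubQueue: indices are in segments, and a stub of
// size s starting at index i is followed by the stub at index i + s.
struct ToyStubQueue {
  int _buffer_size;   // total capacity
  int _buffer_limit;  // end of the used part of the buffer
  int _queue_end;     // one past the last stub

  // Returns the index of the next stub, or -1 (standing in for NULL) at the end.
  int next(int i, int stub_size) const {
    i += stub_size;
    // Only wrap around in the non-contiguous case; without the second test a
    // full, trimmed queue (_queue_end == _buffer_limit) wraps back to 0 and
    // never reaches _queue_end.
    if (i == _buffer_limit && _queue_end < _buffer_limit) i = 0;
    return (i == _queue_end) ? -1 : i;
  }
};

int main() {
  ToyStubQueue q{8, 8, 8};  // trimmed queue: limit == end == size
  for (int i = 0; i != -1; i = q.next(i, 2)) {
    std::printf("stub at %d\n", i);  // 0, 2, 4, 6, then the loop terminates
  }
  return 0;
}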
> >> >>> > >> >>> Thanks, > >> >>> Vladimir > >> >>> > >> >>>> > >> >>>> Thanks, > >> >>>> Volker > >> >>>> > >> >>>>> Thanks, > >> >>>>> Vladimir > >> >>>>> > >> >>>>> > >> >>>>> On 9/1/17 8:46 AM, Volker Simonis wrote: > >> >>>>>> > >> >>>>>> > >> >>>>>> Hi, > >> >>>>>> > >> >>>>>> I've decided to split the fix for the > 'CodeHeap::contains_blob()' > >> >>>>>> problem into its own issue "8187091: > ReturnBlobToWrongHeapTest fails > >> >>>>>> because of problems in CodeHeap::contains_blob()" > >> >>>>>> (https://bugs.openjdk.java.net/browse/JDK-8187091) and > started a new > >> >>>>>> review thread for discussing it at: > >> >>>>>> > >> >>>>>> > >> >>>>>> > http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028206.html > >> >>>>>> > >> >>>>>> So please lets keep this thread for discussing the > interpreter code > >> >>>>>> size issue only. I've prepared a new version of the webrev > which is > >> >>>>>> the same as the first one with the only difference that the > change to > >> >>>>>> 'CodeHeap::contains_blob()' has been removed: > >> >>>>>> > >> >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v1/ > > >> >>>>>> > >> >>>>>> Thanks, > >> >>>>>> Volker > >> >>>>>> > >> >>>>>> > >> >>>>>> On Thu, Aug 31, 2017 at 6:35 PM, Volker Simonis > >> >>>>>> > > wrote: > >> >>>>>>> > >> >>>>>>> > >> >>>>>>> On Thu, Aug 31, 2017 at 6:05 PM, Vladimir Kozlov > >> >>>>>>> vladimir.kozlov at oracle.com>> wrote: > >> >>>>>>>> > >> >>>>>>>> > >> >>>>>>>> Very good change. Thank you, Volker. > >> >>>>>>>> > >> >>>>>>>> About contains_blob(). The problem is that > AOTCompiledMethod > >> >>>>>>>> allocated > >> >>>>>>>> in > >> >>>>>>>> CHeap and not in aot code section (which is RO): > >> >>>>>>>> > >> >>>>>>>> > >> >>>>>>>> > >> >>>>>>>> > http://hg.openjdk.java.net/jdk10/hs/hotspot/file/8acd232fb52a/src/share/vm/aot/aotCompiledMethod.hpp#l124 > >> >>>>>>>> > >> >>>>>>>> It is allocated in CHeap after AOT library is loaded. Its > >> >>>>>>>> code_begin() > >> >>>>>>>> points to AOT code section but AOTCompiledMethod* points > outside it > >> >>>>>>>> (to > >> >>>>>>>> normal malloced space) so you can't use (char*)blob > address. > >> >>>>>>>> > >> >>>>>>> > >> >>>>>>> Thanks for the explanation - now I got it. > >> >>>>>>> > >> >>>>>>>> There are 2 ways to fix it, I think. > >> >>>>>>>> One is to add new field to CodeBlobLayout and set it to > blob* address > >> >>>>>>>> for > >> >>>>>>>> normal CodeCache blobs and to code_begin for AOT code. > >> >>>>>>>> Second is to use contains(blob->code_end() - 1) assuming > that AOT > >> >>>>>>>> code > >> >>>>>>>> is > >> >>>>>>>> never zero. > >> >>>>>>>> > >> >>>>>>> > >> >>>>>>> I'll give it a try tomorrow and will send out a new webrev. 
> >> >>>>>>> > >> >>>>>>> Regards, > >> >>>>>>> Volker > >> >>>>>>> > >> >>>>>>>> Thanks, > >> >>>>>>>> Vladimir > >> >>>>>>>> > >> >>>>>>>> > >> >>>>>>>> On 8/31/17 5:43 AM, Volker Simonis wrote: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On Thu, Aug 31, 2017 at 12:14 PM, Claes Redestad > >> >>>>>>>>> claes.redestad at oracle.com>> wrote: > >> >>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>>> On 2017-08-31 08:54, Volker Simonis wrote: > >> >>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>>> While working on this, I found another problem which is > related to > >> >>>>>>>>>>> the > >> >>>>>>>>>>> fix of JDK-8183573 and leads to crashes when executing > the JTreg > >> >>>>>>>>>>> test > >> >>>>>>>>>>> > compiler/codecache/stress/ReturnBlobToWrongHeapTest.java. > >> >>>>>>>>>>> > >> >>>>>>>>>>> The problem is that JDK-8183573 replaced > >> >>>>>>>>>>> > >> >>>>>>>>>>> virtual bool contains_blob(const CodeBlob* blob) > const { > >> >>>>>>>>>>> return > >> >>>>>>>>>>> low_boundary() <= (char*) blob && (char*) blob < > high(); } > >> >>>>>>>>>>> > >> >>>>>>>>>>> by: > >> >>>>>>>>>>> > >> >>>>>>>>>>> bool contains_blob(const CodeBlob* blob) const { > return > >> >>>>>>>>>>> contains(blob->code_begin()); } > >> >>>>>>>>>>> > >> >>>>>>>>>>> But that my be wrong in the corner case where the size > of the > >> >>>>>>>>>>> CodeBlob's payload is zero (i.e. the CodeBlob consists > only of the > >> >>>>>>>>>>> 'header' - i.e. the C++ object itself) because in that > case > >> >>>>>>>>>>> CodeBlob::code_begin() points right behind the > CodeBlob's header > >> >>>>>>>>>>> which > >> >>>>>>>>>>> is a memory location which doesn't belong to the > CodeBlob anymore. > >> >>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>>> I recall this change was somehow necessary to allow > merging > >> >>>>>>>>>> AOTCodeHeap::contains_blob and CodeHead::contains_blob > into > >> >>>>>>>>>> one devirtualized method, so you need to ensure all AOT > tests > >> >>>>>>>>>> pass with this change (on linux-x64). > >> >>>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> All of hotspot/test/aot and hotspot/test/jvmci executed > and passed > >> >>>>>>>>> successful. Are there any other tests I should check? > >> >>>>>>>>> > >> >>>>>>>>> That said, it is a little hard to follow the stages of > your change. > >> >>>>>>>>> It > >> >>>>>>>>> seems like > >> >>>>>>>>> > http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.00/ < > http://cr.openjdk.java.net/%7Eredestad/scratch/codeheap_contains.00/> > >> >>>>>>>>> was reviewed [1] but then finally the slightly changed > version from > >> >>>>>>>>> > http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.01/ < > http://cr.openjdk.java.net/%7Eredestad/scratch/codeheap_contains.01/> > >> >>>>>>>>> was > >> >>>>>>>>> checked in and linked to the bug report. > >> >>>>>>>>> > >> >>>>>>>>> The first, reviewed version of the change still had a > correct > >> >>>>>>>>> version > >> >>>>>>>>> of 'CodeHeap::contains_blob(const CodeBlob* blob)' while > the second, > >> >>>>>>>>> checked in version has the faulty version of that method. > >> >>>>>>>>> > >> >>>>>>>>> I don't know why you finally did that change to > 'contains_blob()' > >> >>>>>>>>> but > >> >>>>>>>>> I don't see any reason why we shouldn't be able to > directly use the > >> >>>>>>>>> blob's address for inclusion checking. 
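To make the corner case concrete, here is a toy illustration with simplified types (this is not the real CodeHeap/CodeBlob code, just the pointer arithmetic):

#include <cassert>
#include <cstddef>

// With an empty payload, code_begin() is the first byte *after* the blob, so a
// half-open [low, high) containment test on code_begin() can miss the heap the
// blob actually lives in.
struct ToyHeap {
  const char* _low;
  const char* _high;  // one past the last byte of the heap
  bool contains(const void* p) const {
    const char* c = static_cast<const char*>(p);
    return _low <= c && c < _high;
  }
};

struct ToyBlob {
  std::size_t _header_size;  // the C++ object itself
  std::size_t _code_size;    // payload, may be zero
  const char* code_begin() const {
    return reinterpret_cast<const char*>(this) + _header_size;
  }
};

int main() {
  ToyBlob blob{sizeof(ToyBlob), 0};  // zero-sized payload, blob fills the whole "heap"
  ToyHeap heap{reinterpret_cast<const char*>(&blob),
               reinterpret_cast<const char*>(&blob) + sizeof(blob)};
  assert(heap.contains(&blob));               // the blob address itself is inside
  assert(!heap.contains(blob.code_begin()));  // code_begin() already points past it
  return 0;
}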
From what I > understand, it > >> >>>>>>>>> should ALWAYS be contained in the corresponding CodeHeap > so no > >> >>>>>>>>> reason > >> >>>>>>>>> to mess with 'CodeBlob::code_begin()'. > >> >>>>>>>>> > >> >>>>>>>>> Please let me know if I'm missing something. > >> >>>>>>>>> > >> >>>>>>>>> [1] > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-July/026624.html > >> >>>>>>>>> > >> >>>>>>>>>> I can't help to wonder if we'd not be better served by > disallowing > >> >>>>>>>>>> zero-sized payloads. Is this something that can ever > actually > >> >>>>>>>>>> happen except by abuse of the white box API? > >> >>>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The corresponding test (ReturnBlobToWrongHeapTest.java) > specifically > >> >>>>>>>>> wants to allocate "segment sized" blocks which is most > easily > >> >>>>>>>>> achieved > >> >>>>>>>>> by allocation zero-sized CodeBlobs. And I think there's > nothing > >> >>>>>>>>> wrong > >> >>>>>>>>> about it if we handle the inclusion tests correctly. > >> >>>>>>>>> > >> >>>>>>>>> Thank you and best regards, > >> >>>>>>>>> Volker > >> >>>>>>>>> > >> >>>>>>>>>> /Claes > >> >> > >> >> > >> > > > From patric.hedlin at oracle.com Wed Oct 4 09:04:18 2017 From: patric.hedlin at oracle.com (Patric Hedlin) Date: Wed, 4 Oct 2017 11:04:18 +0200 Subject: JDK10/RFR(M): 8172232: SPARC ISA/CPU feature detection is broken/insufficient (on Linux). In-Reply-To: <9f2896ca-65dc-557f-793c-4235499cc340@oracle.com> References: <7d5e1ebb-7de8-66f1-a1f0-db465bcad4ab@oracle.com> <9f2896ca-65dc-557f-793c-4235499cc340@oracle.com> Message-ID: <3fcc865d-3eda-a341-e112-8417711ee3e5@oracle.com> Thanks for reviewing Vladimir. On 09/29/2017 08:56 PM, Vladimir Kozlov wrote: > In general it is fine. Few notes. > You use ifdef DEBUG_SPARC_CAPS which is undefed at the beginning. Is > it set by gcc by default? > Removed. > Coding style for methods definitions - open parenthesis should be on > the same line: > > + bool match(const char* s) const > + { > Updated/re-formated. Refreshed webrev. @Adrian: Please validate. Best regards, Patric > Thanks, > Vladimir > > On 9/29/17 6:08 AM, Patric Hedlin wrote: >> Dear all, >> >> I would like to ask for help to review the following change/update: >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8172232 >> >> Webrev: http://cr.openjdk.java.net/~phedlin/tr8172232/ >> >> >> 8172232: SPARC ISA/CPU feature detection is broken/insufficient (on >> Linux). >> >> Subsumes (duplicate) JDK-8186579: >> VM_Version::platform_features() needs update on linux-sparc. >> >> >> Caveat: >> >> This update will introduce some redundancies into the code base, >> features and definitions >> currently not used, addressed by subsequent bug or feature >> updates/patches. Fujitsu HW is >> treated very conservatively. >> >> >> Testing: >> >> JDK9/JDK10 local jtreg/hotspot >> >> >> Thanks to Adrian for additional test (and review) support. >> >> Tested-By: John Paul Adrian Glaubitz >> >> >> Best regards, >> Patric >> From glaubitz at physik.fu-berlin.de Wed Oct 4 09:39:35 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Wed, 4 Oct 2017 11:39:35 +0200 Subject: JDK10/RFR(M): 8172232: SPARC ISA/CPU feature detection is broken/insufficient (on Linux). 
In-Reply-To: <3fcc865d-3eda-a341-e112-8417711ee3e5@oracle.com> References: <7d5e1ebb-7de8-66f1-a1f0-db465bcad4ab@oracle.com> <9f2896ca-65dc-557f-793c-4235499cc340@oracle.com> <3fcc865d-3eda-a341-e112-8417711ee3e5@oracle.com> Message-ID: <55211504-0f3e-52a0-0930-f34babb5da14@physik.fu-berlin.de> On 10/04/2017 11:04 AM, Patric Hedlin wrote: > Refreshed webrev. > > @Adrian: Please validate. Done. Both the server and the zero variant build fine on linux-sparc with the updated webrev, hence: Tested-By: John Paul Adrian Glaubitz Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. `' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From patric.hedlin at oracle.com Wed Oct 4 09:39:56 2017 From: patric.hedlin at oracle.com (Patric Hedlin) Date: Wed, 4 Oct 2017 11:39:56 +0200 Subject: JDK10/RFR(M): 8172232: SPARC ISA/CPU feature detection is broken/insufficient (on Linux). In-Reply-To: <55211504-0f3e-52a0-0930-f34babb5da14@physik.fu-berlin.de> References: <7d5e1ebb-7de8-66f1-a1f0-db465bcad4ab@oracle.com> <9f2896ca-65dc-557f-793c-4235499cc340@oracle.com> <3fcc865d-3eda-a341-e112-8417711ee3e5@oracle.com> <55211504-0f3e-52a0-0930-f34babb5da14@physik.fu-berlin.de> Message-ID: Thanks Adrian. /Patric On 10/04/2017 11:39 AM, John Paul Adrian Glaubitz wrote: > On 10/04/2017 11:04 AM, Patric Hedlin wrote: >> Refreshed webrev. >> >> @Adrian: Please validate. > Done. Both the server and the zero variant build fine on linux-sparc > with the updated webrev, hence: > > Tested-By: John Paul Adrian Glaubitz > > Adrian > From glaubitz at physik.fu-berlin.de Wed Oct 4 09:58:17 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Wed, 4 Oct 2017 11:58:17 +0200 Subject: JDK10/RFR(M): 8172232: SPARC ISA/CPU feature detection is broken/insufficient (on Linux). In-Reply-To: References: <7d5e1ebb-7de8-66f1-a1f0-db465bcad4ab@oracle.com> <9f2896ca-65dc-557f-793c-4235499cc340@oracle.com> <3fcc865d-3eda-a341-e112-8417711ee3e5@oracle.com> <55211504-0f3e-52a0-0930-f34babb5da14@physik.fu-berlin.de> Message-ID: <2d1fd501-8ba3-7591-a360-2cdc114cfbe9@physik.fu-berlin.de> On 10/04/2017 11:39 AM, Patric Hedlin wrote: > Thanks Adrian. Thank you for your work on this :-). Hope this gets merged soon. After that, the linux-sparc builds won't need any external patches downstream anymore. Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. `' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From HORIE at jp.ibm.com Wed Oct 4 10:13:58 2017 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Wed, 4 Oct 2017 19:13:58 +0900 Subject: RFR(S):8188757:PPC64:Disable VSR52-63 in ppc.ad Message-ID: Dear all, Would you please review the following change in hs? Bug: https://bugs.openjdk.java.net/browse/JDK-8188757 Webrev: http://cr.openjdk.java.net/~mhorie/8188757/webrev.00/ This change disables VSR52-63 because currently there is no support for these registers to be properly treated as nonvolatile. Also, this change removes redundant logical or with 1u to enforce to use VSR32- registers in assembler_ppc.inline.hpp, which was done in my previous webrev for 8188139. 
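For readers who do not have the VSX encoding in their head, a rough sketch of why the extra or with 1u is redundant. The exact instruction fields differ per opcode and this is not the assembler_ppc code, only the general idea: a 6-bit VSR number is encoded as a 5-bit field plus an extension bit, and VSR32-63 are exactly the registers whose extension bit is already 1, so forcing that bit is unnecessary once the full register number is passed through.

#include <cassert>
#include <cstdint>

struct VsxRegOperand {
  uint32_t field5;  // low five bits of the VSR number
  uint32_t ext;     // extension bit: 0 selects VSR0-31, 1 selects VSR32-63
};

inline VsxRegOperand encode_vsr(unsigned vsr) {
  assert(vsr < 64);
  return VsxRegOperand{ vsr & 0x1fu, (vsr >> 5) & 1u };
}

int main() {
  assert(encode_vsr(52).ext == 1);  // VSR52 already lands in the upper half
  assert(encode_vsr(20).ext == 0);  // VSR20 stays in the lower half
  return 0;
}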
Best regards, -- Michihiro, IBM Research - Tokyo From martin.doerr at sap.com Wed Oct 4 12:05:49 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 4 Oct 2017 12:05:49 +0000 Subject: RFR(S):8188757:PPC64:Disable VSR52-63 in ppc.ad In-Reply-To: References: Message-ID: <47f7c8e22e364223b3f049998cf2506f@sap.com> Hi Michihiro, thanks for fixing it so quickly. Reviewed and pushed. Best regards, Martin From: Michihiro Horie [mailto:HORIE at jp.ibm.com] Sent: Mittwoch, 4. Oktober 2017 12:14 To: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; Doerr, Martin Cc: Simonis, Volker ; Hiroshi H Horii ; Kazunori Ogata ; Gustavo Romero Subject: RFR(S):8188757:PPC64:Disable VSR52-63 in ppc.ad Dear all, Would you please review the following change in hs? Bug: https://bugs.openjdk.java.net/browse/JDK-8188757 Webrev: http://cr.openjdk.java.net/~mhorie/8188757/webrev.00/ This change disables VSR52-63 because currently there is no support for these registers to be properly treated as nonvolatile. Also, this change removes redundant logical or with 1u to enforce to use VSR32- registers in assembler_ppc.inline.hpp, which was done in my previous webrev for 8188139. Best regards, -- Michihiro, IBM Research - Tokyo From coleen.phillimore at oracle.com Wed Oct 4 12:08:43 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 4 Oct 2017 08:08:43 -0400 Subject: RFR (M): 8188224: Generalize Atomic::load/store to use templates In-Reply-To: <59D38963.2070806@oracle.com> References: <59D38293.7030800@oracle.com> <712e1c4e-b38b-11c3-4b51-d88f1560a063@oracle.com> <59D38963.2070806@oracle.com> Message-ID: So this change is becoming more familiar but I think it's because the comment is repeated now for cmpxchg, add, and now load and store.?? My scanning ability is too limited to spot the differences.? I don't like the duplicated comments at all. I don't know if this is possible and not with this change, but I think there should be a class platformAtomic.hpp which consolidates these comments and moves the platform* stuff out of atomic.hpp, to be included or subclassed by atomic.hpp.? Then we can find our desired Atomic::blah functions again.?? I would like an RFE for this. Otherwise, I've pattern matched this and it seems correct and am fine with checking this in. http://cr.openjdk.java.net/~eosterlund/8188224/webrev.01/src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp.udiff.html These changes I really like because now we don't have to go hunting to see that atomic::load/store is just *thing. Thanks! Coleen On 10/3/17 8:58 AM, Erik ?sterlund wrote: > Hi David, > > Thanks for the review. > The comments have been removed. > > New full webrev: > http://cr.openjdk.java.net/~eosterlund/8188224/webrev.01/ > > New incremental webrev: > http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00_01/ > > Thanks, > /Erik > > On 2017-10-03 14:44, David Holmes wrote: >> Hi Erik, >> >> A lot of jumping through hoops just to do a direct load/store in the >> bulk of cases - but okay, we're embracing templates. >> >> 66?? // Atomically store to a location >> 67?? // See comment above about using jlong atomics on 32-bit platforms >> >> The comment at #67 and the equivalent one for load can be deleted. >> The "comment above" should only be referring to r-m-w atomic ops not >> basic load and store. 
All platforms must have a means to do atomic >> load/store of 64-bit due to Java volatile variables (eg by using >> floating-point unit on 32-bit) but may not have cmpxchg<8> >> capability. (I failed to convince the author of this when those >> comments went in. ;-) ) >> >> Cheers, >> David >> >> On 3/10/2017 10:29 PM, Erik ?sterlund wrote: >>> Hi, >>> >>> The time has come to generalize Atomic::load/store with templates - >>> the last operation to generalize in Atomic. >>> The design was inspired by Atomic::xchg and uses a similar mechanism >>> to validate the passed in arguments. It was also designed with >>> coming OrderAccess changes in mind. OrderAccess also contains loads >>> and stores that will reuse the LoadImpl and StoreImpl infrastructure >>> in Atomic::load/store. (the type checking for what is okay to pass >>> in to Atomic::load/store is very much the same for >>> OrderAccess::load_acquire/*store*). >>> >>> One thing worth mentioning is that the bsd zero port (but notably >>> not the linux zero port) had a leading fence for atomic stores of >>> jint when #if !defined(ARM) && !defined(M68K) is true without any >>> comment describing why. So I took the liberty of removing it. Atomic >>> should not have any fencing at all - that is what OrderAccess is >>> for. In fact Atomic does not promise any memory ordering semantics >>> for loads and stores. Atomic merely provides relaxed accesses that >>> are atomic. Worth mentioning nevertheless in case anyone wants to >>> keep that jint Atomic::store fence on bsd zero !M68K && !ARM. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8188224 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00/ >>> >>> Testing: JPRT, mach5 hs-tier3 >>> >>> Thanks, >>> /Erik > From coleen.phillimore at oracle.com Wed Oct 4 12:09:55 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 4 Oct 2017 08:09:55 -0400 Subject: RFR (M): 8188224: Generalize Atomic::load/store to use templates In-Reply-To: References: <59D38293.7030800@oracle.com> <712e1c4e-b38b-11c3-4b51-d88f1560a063@oracle.com> <59D38963.2070806@oracle.com> Message-ID: <29af85a5-28c4-c617-abb8-42cda7dea371@oracle.com> On 10/4/17 8:08 AM, coleen.phillimore at oracle.com wrote: > > So this change is becoming more familiar but I think it's because the > comment is repeated now for cmpxchg, add, and now load and store.?? My > scanning ability is too limited to spot the differences.? I don't like > the duplicated comments at all. ^ long (> 1 line) > > I don't know if this is possible and not with this change, but I think > there should be a class platformAtomic.hpp which consolidates these > comments and moves the platform* stuff out of atomic.hpp, to be > included or subclassed by atomic.hpp.? Then we can find our desired > Atomic::blah functions again.?? I would like an RFE for this. > > Otherwise, I've pattern matched this and it seems correct and am fine > with checking this in. > > http://cr.openjdk.java.net/~eosterlund/8188224/webrev.01/src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp.udiff.html > > > These changes I really like because now we don't have to go hunting to > see that atomic::load/store is just *thing. > > Thanks! > Coleen > > On 10/3/17 8:58 AM, Erik ?sterlund wrote: >> Hi David, >> >> Thanks for the review. >> The comments have been removed. 
>> >> New full webrev: >> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.01/ >> >> New incremental webrev: >> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00_01/ >> >> Thanks, >> /Erik >> >> On 2017-10-03 14:44, David Holmes wrote: >>> Hi Erik, >>> >>> A lot of jumping through hoops just to do a direct load/store in the >>> bulk of cases - but okay, we're embracing templates. >>> >>> 66?? // Atomically store to a location >>> 67?? // See comment above about using jlong atomics on 32-bit platforms >>> >>> The comment at #67 and the equivalent one for load can be deleted. >>> The "comment above" should only be referring to r-m-w atomic ops not >>> basic load and store. All platforms must have a means to do atomic >>> load/store of 64-bit due to Java volatile variables (eg by using >>> floating-point unit on 32-bit) but may not have cmpxchg<8> >>> capability. (I failed to convince the author of this when those >>> comments went in. ;-) ) >>> >>> Cheers, >>> David >>> >>> On 3/10/2017 10:29 PM, Erik ?sterlund wrote: >>>> Hi, >>>> >>>> The time has come to generalize Atomic::load/store with templates - >>>> the last operation to generalize in Atomic. >>>> The design was inspired by Atomic::xchg and uses a similar >>>> mechanism to validate the passed in arguments. It was also designed >>>> with coming OrderAccess changes in mind. OrderAccess also contains >>>> loads and stores that will reuse the LoadImpl and StoreImpl >>>> infrastructure in Atomic::load/store. (the type checking for what >>>> is okay to pass in to Atomic::load/store is very much the same for >>>> OrderAccess::load_acquire/*store*). >>>> >>>> One thing worth mentioning is that the bsd zero port (but notably >>>> not the linux zero port) had a leading fence for atomic stores of >>>> jint when #if !defined(ARM) && !defined(M68K) is true without any >>>> comment describing why. So I took the liberty of removing it. >>>> Atomic should not have any fencing at all - that is what >>>> OrderAccess is for. In fact Atomic does not promise any memory >>>> ordering semantics for loads and stores. Atomic merely provides >>>> relaxed accesses that are atomic. Worth mentioning nevertheless in >>>> case anyone wants to keep that jint Atomic::store fence on bsd zero >>>> !M68K && !ARM. >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8188224 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00/ >>>> >>>> Testing: JPRT, mach5 hs-tier3 >>>> >>>> Thanks, >>>> /Erik >> > From erik.osterlund at oracle.com Wed Oct 4 13:06:17 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 4 Oct 2017 15:06:17 +0200 Subject: RFR (M): 8188224: Generalize Atomic::load/store to use templates In-Reply-To: References: <59D38293.7030800@oracle.com> <712e1c4e-b38b-11c3-4b51-d88f1560a063@oracle.com> <59D38963.2070806@oracle.com> Message-ID: <59D4DCC9.6080107@oracle.com> Hi Coleen, On 2017-10-04 14:08, coleen.phillimore at oracle.com wrote: > > So this change is becoming more familiar but I think it's because the > comment is repeated now for cmpxchg, add, and now load and store. My > scanning ability is too limited to spot the differences. I don't like > the duplicated comments at all. > > I don't know if this is possible and not with this change, but I think > there should be a class platformAtomic.hpp which consolidates these > comments and moves the platform* stuff out of atomic.hpp, to be > included or subclassed by atomic.hpp. 
Then we can find our desired > Atomic::blah functions again. I would like an RFE for this. I see what you are saying. When you think about it, is almost as if we want the comments themselves to be template expanded for each operation. (joking) I will file an RFE for this. > Otherwise, I've pattern matched this and it seems correct and am fine > with checking this in. > > http://cr.openjdk.java.net/~eosterlund/8188224/webrev.01/src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp.udiff.html > > > These changes I really like because now we don't have to go hunting to > see that atomic::load/store is just *thing. Thank you for the review! /Erik > > Thanks! > Coleen > > On 10/3/17 8:58 AM, Erik ?sterlund wrote: >> Hi David, >> >> Thanks for the review. >> The comments have been removed. >> >> New full webrev: >> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.01/ >> >> New incremental webrev: >> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00_01/ >> >> Thanks, >> /Erik >> >> On 2017-10-03 14:44, David Holmes wrote: >>> Hi Erik, >>> >>> A lot of jumping through hoops just to do a direct load/store in the >>> bulk of cases - but okay, we're embracing templates. >>> >>> 66 // Atomically store to a location >>> 67 // See comment above about using jlong atomics on 32-bit platforms >>> >>> The comment at #67 and the equivalent one for load can be deleted. >>> The "comment above" should only be referring to r-m-w atomic ops not >>> basic load and store. All platforms must have a means to do atomic >>> load/store of 64-bit due to Java volatile variables (eg by using >>> floating-point unit on 32-bit) but may not have cmpxchg<8> >>> capability. (I failed to convince the author of this when those >>> comments went in. ;-) ) >>> >>> Cheers, >>> David >>> >>> On 3/10/2017 10:29 PM, Erik ?sterlund wrote: >>>> Hi, >>>> >>>> The time has come to generalize Atomic::load/store with templates - >>>> the last operation to generalize in Atomic. >>>> The design was inspired by Atomic::xchg and uses a similar >>>> mechanism to validate the passed in arguments. It was also designed >>>> with coming OrderAccess changes in mind. OrderAccess also contains >>>> loads and stores that will reuse the LoadImpl and StoreImpl >>>> infrastructure in Atomic::load/store. (the type checking for what >>>> is okay to pass in to Atomic::load/store is very much the same for >>>> OrderAccess::load_acquire/*store*). >>>> >>>> One thing worth mentioning is that the bsd zero port (but notably >>>> not the linux zero port) had a leading fence for atomic stores of >>>> jint when #if !defined(ARM) && !defined(M68K) is true without any >>>> comment describing why. So I took the liberty of removing it. >>>> Atomic should not have any fencing at all - that is what >>>> OrderAccess is for. In fact Atomic does not promise any memory >>>> ordering semantics for loads and stores. Atomic merely provides >>>> relaxed accesses that are atomic. Worth mentioning nevertheless in >>>> case anyone wants to keep that jint Atomic::store fence on bsd zero >>>> !M68K && !ARM. 
>>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8188224 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00/ >>>> >>>> Testing: JPRT, mach5 hs-tier3 >>>> >>>> Thanks, >>>> /Erik >> > From coleen.phillimore at oracle.com Wed Oct 4 13:34:44 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 4 Oct 2017 09:34:44 -0400 Subject: RFR (M): 8188224: Generalize Atomic::load/store to use templates In-Reply-To: <59D4DCC9.6080107@oracle.com> References: <59D38293.7030800@oracle.com> <712e1c4e-b38b-11c3-4b51-d88f1560a063@oracle.com> <59D38963.2070806@oracle.com> <59D4DCC9.6080107@oracle.com> Message-ID: <1610a769-7a46-7568-191f-5f480b7fea99@oracle.com> On 10/4/17 9:06 AM, Erik ?sterlund wrote: > Hi Coleen, > > On 2017-10-04 14:08, coleen.phillimore at oracle.com wrote: >> >> So this change is becoming more familiar but I think it's because the >> comment is repeated now for cmpxchg, add, and now load and store.?? >> My scanning ability is too limited to spot the differences.? I don't >> like the duplicated comments at all. >> >> I don't know if this is possible and not with this change, but I >> think there should be a class platformAtomic.hpp which consolidates >> these comments and moves the platform* stuff out of atomic.hpp, to be >> included or subclassed by atomic.hpp.? Then we can find our desired >> Atomic::blah functions again.?? I would like an RFE for this. > > I see what you are saying. When you think about it, is almost as if we > want the comments themselves to be template expanded for each > operation. (joking) LOL, I almost wrote this :) > I will file an RFE for this. Thanks! Coleen > > >> Otherwise, I've pattern matched this and it seems correct and am fine >> with checking this in. >> >> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.01/src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp.udiff.html >> >> >> These changes I really like because now we don't have to go hunting >> to see that atomic::load/store is just *thing. > > Thank you for the review! > > /Erik > >> >> Thanks! >> Coleen >> >> On 10/3/17 8:58 AM, Erik ?sterlund wrote: >>> Hi David, >>> >>> Thanks for the review. >>> The comments have been removed. >>> >>> New full webrev: >>> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.01/ >>> >>> New incremental webrev: >>> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00_01/ >>> >>> Thanks, >>> /Erik >>> >>> On 2017-10-03 14:44, David Holmes wrote: >>>> Hi Erik, >>>> >>>> A lot of jumping through hoops just to do a direct load/store in >>>> the bulk of cases - but okay, we're embracing templates. >>>> >>>> 66?? // Atomically store to a location >>>> 67?? // See comment above about using jlong atomics on 32-bit >>>> platforms >>>> >>>> The comment at #67 and the equivalent one for load can be deleted. >>>> The "comment above" should only be referring to r-m-w atomic ops >>>> not basic load and store. All platforms must have a means to do >>>> atomic load/store of 64-bit due to Java volatile variables (eg by >>>> using floating-point unit on 32-bit) but may not have cmpxchg<8> >>>> capability. (I failed to convince the author of this when those >>>> comments went in. ;-) ) >>>> >>>> Cheers, >>>> David >>>> >>>> On 3/10/2017 10:29 PM, Erik ?sterlund wrote: >>>>> Hi, >>>>> >>>>> The time has come to generalize Atomic::load/store with templates >>>>> - the last operation to generalize in Atomic. 
>>>>> The design was inspired by Atomic::xchg and uses a similar >>>>> mechanism to validate the passed in arguments. It was also >>>>> designed with coming OrderAccess changes in mind. OrderAccess also >>>>> contains loads and stores that will reuse the LoadImpl and >>>>> StoreImpl infrastructure in Atomic::load/store. (the type checking >>>>> for what is okay to pass in to Atomic::load/store is very much the >>>>> same for OrderAccess::load_acquire/*store*). >>>>> >>>>> One thing worth mentioning is that the bsd zero port (but notably >>>>> not the linux zero port) had a leading fence for atomic stores of >>>>> jint when #if !defined(ARM) && !defined(M68K) is true without any >>>>> comment describing why. So I took the liberty of removing it. >>>>> Atomic should not have any fencing at all - that is what >>>>> OrderAccess is for. In fact Atomic does not promise any memory >>>>> ordering semantics for loads and stores. Atomic merely provides >>>>> relaxed accesses that are atomic. Worth mentioning nevertheless in >>>>> case anyone wants to keep that jint Atomic::store fence on bsd >>>>> zero !M68K && !ARM. >>>>> >>>>> Bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8188224 >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~eosterlund/8188224/webrev.00/ >>>>> >>>>> Testing: JPRT, mach5 hs-tier3 >>>>> >>>>> Thanks, >>>>> /Erik >>> >> > From bob.vandette at oracle.com Wed Oct 4 18:14:29 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 4 Oct 2017 14:14:29 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> Message-ID: Robbin, I?ve looked into this issue and you are correct. I do have to examine both the sched_getaffinity results as well as the cgroup cpu subsystem configuration files in order to provide a reasonable value for active_processors. If I was only interested in cpusets, I could simply rely on the getaffinity call but I also want to factor in shares and quotas as well. I had assumed that when sched_setaffinity was called (in your case by numactl) that the cgroup cpu config files would be updated to reflect the current processor affinity for the running process. This is not correct. I have updated my changeset and have successfully run with your examples below. I?ll post a new webrev soon. Thanks, Bob. > >> I still want to include the flag for at least one Java release in the event that the new behavior causes some regression >> in behavior. I?m trying to make the detection robust so that it will fallback to the current behavior in the event >> that cgroups is not configured as expected but I?d like to have a way of forcing the issue. JDK 10 is not >> supposed to be a long term support release which makes it a good target for this new behavior. >> I agree with David that once we commit to cgroups, we should extract all VM configuration data from that >> source. There?s more information available for cpusets than just processor affinity that we might want to >> consider when calculating the number of processors to assume for the VM. There?s exclusivity and >> effective cpu data available in addition to the cpuset string. > > cgroup only contains limits, not the real hard limits. > You most consider the affinity mask. 
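To make that concrete, here is a rough sketch of the kind of combination being discussed. This is not the 8146115 patch; the cgroup v1 file names and the 1024-shares-per-cpu convention are assumptions taken from this thread, and it is Linux/glibc only. The idea is to start from the sched_getaffinity mask as the hard limit and then only narrow it by quota/period and shares, which also keeps the numactl examples quoted next working:

#include <sched.h>
#include <fstream>
#include <cstdio>

static long read_long(const char* path, long fallback) {
  std::ifstream in(path);
  long v = fallback;
  if (!(in >> v)) return fallback;
  return v;
}

int active_processor_count() {
  cpu_set_t mask;
  CPU_ZERO(&mask);
  if (sched_getaffinity(0, sizeof(mask), &mask) != 0) return 1;
  int limit = CPU_COUNT(&mask);  // what we can actually run on concurrently

  long quota  = read_long("/sys/fs/cgroup/cpu/cpu.cfs_quota_us",  -1);
  long period = read_long("/sys/fs/cgroup/cpu/cpu.cfs_period_us", -1);
  long shares = read_long("/sys/fs/cgroup/cpu/cpu.shares",        -1);

  if (quota > 0 && period > 0) {
    int quota_cpus = (int)((quota + period - 1) / period);  // round the fraction up
    if (quota_cpus < limit) limit = quota_cpus;
  }
  if (shares > 0) {
    // Convention from this thread: N*1024 shares means N cpus. Note that 1024
    // is also the unconfigured default, so a real implementation has to be
    // able to tell "not set" apart from "one cpu".
    int share_cpus = (int)(shares / 1024);
    if (share_cpus >= 1 && share_cpus < limit) limit = share_cpus;
  }
  return limit > 0 ? limit : 1;
}

int main() {
  std::printf("active processors: %d\n", active_processor_count());
  return 0;
}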
We that have numa nodes do: > > [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -cp . ForEver | grep proc > [0.001s][debug][os] Initial active processor count set to 16 > [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc > [0.001s][debug][os] Initial active processor count set to 32 > > when benchmarking all the time and that must be set to 16 otherwise the flag is really bad for us. > So the flag actually breaks the little numa support we have now. > > Thanks, Robbin From robbin.ehn at oracle.com Wed Oct 4 18:30:34 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 4 Oct 2017 20:30:34 +0200 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> Message-ID: <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> Thanks Bob for looking into this. On 10/04/2017 08:14 PM, Bob Vandette wrote: > Robbin, > > I?ve looked into this issue and you are correct. I do have to examine both the > sched_getaffinity results as well as the cgroup cpu subsystem configuration > files in order to provide a reasonable value for active_processors. If I was only > interested in cpusets, I could simply rely on the getaffinity call but I also want to > factor in shares and quotas as well. We had a quick discussion at the office, we actually do think that you could skip reading the shares and quotas. It really depends on what the user expect, if he give us 4 cpu's with 50% or 2 full cpu what do he expect the differences would be? One could argue that he 'knows' that he will only use max 50% and thus we can act as if he is giving us 4 full cpu. But I'll leave that up to you, just a tough we had. > > I had assumed that when sched_setaffinity was called (in your case by numactl) that the > cgroup cpu config files would be updated to reflect the current processor affinity for the > running process. This is not correct. I have updated my changeset and have successfully > run with your examples below. I?ll post a new webrev soon. > I see, thanks again! /Robbin > Thanks, > Bob. > > >> >>> I still want to include the flag for at least one Java release in the event that the new behavior causes some regression >>> in behavior. I?m trying to make the detection robust so that it will fallback to the current behavior in the event >>> that cgroups is not configured as expected but I?d like to have a way of forcing the issue. JDK 10 is not >>> supposed to be a long term support release which makes it a good target for this new behavior. >>> I agree with David that once we commit to cgroups, we should extract all VM configuration data from that >>> source. There?s more information available for cpusets than just processor affinity that we might want to >>> consider when calculating the number of processors to assume for the VM. There?s exclusivity and >>> effective cpu data available in addition to the cpuset string. >> >> cgroup only contains limits, not the real hard limits. >> You most consider the affinity mask. We that have numa nodes do: >> >> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -cp . 
ForEver | grep proc >> [0.001s][debug][os] Initial active processor count set to 16 >> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc >> [0.001s][debug][os] Initial active processor count set to 32 >> >> when benchmarking all the time and that must be set to 16 otherwise the flag is really bad for us. >> So the flag actually breaks the little numa support we have now. >> >> Thanks, Robbin > From bob.vandette at oracle.com Wed Oct 4 18:51:04 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 4 Oct 2017 14:51:04 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> Message-ID: > On Oct 4, 2017, at 2:30 PM, Robbin Ehn wrote: > > Thanks Bob for looking into this. > > On 10/04/2017 08:14 PM, Bob Vandette wrote: >> Robbin, >> I?ve looked into this issue and you are correct. I do have to examine both the >> sched_getaffinity results as well as the cgroup cpu subsystem configuration >> files in order to provide a reasonable value for active_processors. If I was only >> interested in cpusets, I could simply rely on the getaffinity call but I also want to >> factor in shares and quotas as well. > > We had a quick discussion at the office, we actually do think that you could skip reading the shares and quotas. > It really depends on what the user expect, if he give us 4 cpu's with 50% or 2 full cpu what do he expect the differences would be? > One could argue that he 'knows' that he will only use max 50% and thus we can act as if he is giving us 4 full cpu. > But I'll leave that up to you, just a tough we had. It?s my opinion that we should do something if someone makes the effort to configure their containers to use quotas or shares. There are many different opinions on what the right that right ?something? is. Many developers that are trying to deploy apps that use containers say they don?t like cpusets. This is too limiting for them especially when the server configurations vary within their organization. From everything I?ve read including source code, there seems to be a consensus that shares and quotas are being used as a way to specify a fraction of a system (number of cpus). Docker added ?cpus which is implemented using quotas and periods. They adjust these two parameters to provide a way of calculating the number of cpus that will be available to a process (quota/period). Amazon also documents that cpu shares are defined to be a multiple of 1024. Where 1024 represents a single cpu and a share value of N*1024 represents N cpus. Of course these are just conventions. This is why I provided a way of specifying the number of CPUs so folks deploying Java services can be certain they get what they want. Bob. > >> I had assumed that when sched_setaffinity was called (in your case by numactl) that the >> cgroup cpu config files would be updated to reflect the current processor affinity for the >> running process. This is not correct. I have updated my changeset and have successfully >> run with your examples below. I?ll post a new webrev soon. > > I see, thanks again! 
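As a concrete illustration of the two conventions discussed in this thread, with invented values rather than anything from the webrev: docker run --cpus=1.5 ends up as quota=150000us over the default period=100000us, and --cpu-shares=2048 reads as two CPUs under the N*1024 convention.

#include <cstdio>

int main() {
  long quota = 150000, period = 100000, shares = 2048;  // assumed example values
  std::printf("quota/period -> %.1f cpus\n", (double)quota / (double)period);  // 1.5
  std::printf("shares/1024  -> %.1f cpus\n", (double)shares / 1024.0);         // 2.0
  return 0;
}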
> > /Robbin > >> Thanks, >> Bob. >>> >>>> I still want to include the flag for at least one Java release in the event that the new behavior causes some regression >>>> in behavior. I?m trying to make the detection robust so that it will fallback to the current behavior in the event >>>> that cgroups is not configured as expected but I?d like to have a way of forcing the issue. JDK 10 is not >>>> supposed to be a long term support release which makes it a good target for this new behavior. >>>> I agree with David that once we commit to cgroups, we should extract all VM configuration data from that >>>> source. There?s more information available for cpusets than just processor affinity that we might want to >>>> consider when calculating the number of processors to assume for the VM. There?s exclusivity and >>>> effective cpu data available in addition to the cpuset string. >>> >>> cgroup only contains limits, not the real hard limits. >>> You most consider the affinity mask. We that have numa nodes do: >>> >>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -cp . ForEver | grep proc >>> [0.001s][debug][os] Initial active processor count set to 16 >>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc >>> [0.001s][debug][os] Initial active processor count set to 32 >>> >>> when benchmarking all the time and that must be set to 16 otherwise the flag is really bad for us. >>> So the flag actually breaks the little numa support we have now. >>> >>> Thanks, Robbin From ceeaspb at gmail.com Wed Oct 4 20:01:09 2017 From: ceeaspb at gmail.com (Alex Bagehot) Date: Wed, 4 Oct 2017 21:01:09 +0100 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> Message-ID: Hi, On Wed, Oct 4, 2017 at 7:51 PM, Bob Vandette wrote: > > > On Oct 4, 2017, at 2:30 PM, Robbin Ehn wrote: > > > > Thanks Bob for looking into this. > > > > On 10/04/2017 08:14 PM, Bob Vandette wrote: > >> Robbin, > >> I?ve looked into this issue and you are correct. I do have to examine > both the > >> sched_getaffinity results as well as the cgroup cpu subsystem > configuration > >> files in order to provide a reasonable value for active_processors. If > I was only > >> interested in cpusets, I could simply rely on the getaffinity call but > I also want to > >> factor in shares and quotas as well. > > > > We had a quick discussion at the office, we actually do think that you > could skip reading the shares and quotas. > > It really depends on what the user expect, if he give us 4 cpu's with > 50% or 2 full cpu what do he expect the differences would be? > > One could argue that he 'knows' that he will only use max 50% and thus > we can act as if he is giving us 4 full cpu. > > But I'll leave that up to you, just a tough we had. > > It?s my opinion that we should do something if someone makes the effort to > configure their > containers to use quotas or shares. There are many different opinions on > what the right that > right ?something? is. > It might be interesting to look at some real instances of how java might[3] be deployed in containers. 
Marathon/Mesos[1] and Kubernetes[2] use shares and quotas so this is a vast chunk of deployments that need both of them today. > > Many developers that are trying to deploy apps that use containers say > they don?t like > cpusets. This is too limiting for them especially when the server > configurations vary > within their organization. > True, however Kubernetes has an alpha feature[5] where it allocates cpusets to containers that request a whole number of cpus. Previously without cpusets any container could run on any cpu which we know might not be good for some workloads that want isolation. A request for a fractional or burstable amount of cpu would be allocated from a shared cpu pool. So although manual allocation of cpusets will be flakey[3] , automation should be able to make it work. > > From everything I?ve read including source code, there seems to be a > consensus that > shares and quotas are being used as a way to specify a fraction of a > system (number of cpus). > A refinement[6] on this is: Shares can be used for guaranteed cpu - you will always get your share. Quota[4] is a limit/constraint - you can never get more than the quota. So given the below limit of how many shares will be allocated on a host you can have burstable(or overcommit) capacity if your shares are less than your quota. > > Docker added ?cpus which is implemented using quotas and periods. They > adjust these > two parameters to provide a way of calculating the number of cpus that > will be available > to a process (quota/period). Amazon also documents that cpu shares are > defined to be a multiple of 1024. > Where 1024 represents a single cpu and a share value of N*1024 represents > N cpus. > Kubernetes and Mesos/Marathon also use the N*1024 shares per host to allocate resources automatically. Hopefully this provides some background on what a couple of orchestration systems that will be running java are doing currently in this area. Thanks, Alex [1] https://github.com/apache/mesos/commit/346cc8dd528a28a6e 1f1cbdb4c95b8bdea2f6070 / (now out of date but appears to be a reasonable intro : https://zcox.wordpress.com/2014/09/17/cpu-resources-in-docke r-mesos-and-marathon/ ) [1a] https://youtu.be/hJyAfC-Z2xk?t=2439 [2] https://kubernetes.io/docs/concepts/configuration/manage -compute-resources-container/ [3] https://youtu.be/w1rZOY5gbvk?t=2479 [4] https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf https://lwn.net/Articles/428175/ [5] https://github.com/kubernetes/community/blob/43ce57ac476b9f2ce3f0220354a075e095a0d469/contributors/design-proposals/node/cpu-manager.md / https://github.com/kubernetes/kubernetes/commit/ 00f0e0f6504ad8dd85fcbbd6294cd7cf2475fc72 / https://vimeo.com/226858314 [6] https://kubernetes.io/docs/concepts/configuration/manage- compute-resources-container/#how-pods-with-resource-limits-are-run > Of course these are just conventions. This is why I provided a way of > specifying the > number of CPUs so folks deploying Java services can be certain they get > what they want. > > Bob. > > > > >> I had assumed that when sched_setaffinity was called (in your case by > numactl) that the > >> cgroup cpu config files would be updated to reflect the current > processor affinity for the > >> running process. This is not correct. I have updated my changeset and > have successfully > >> run with your examples below. I?ll post a new webrev soon. > > > > I see, thanks again! > > > > /Robbin > > > >> Thanks, > >> Bob. 
> >>> > >>>> I still want to include the flag for at least one Java release in the > event that the new behavior causes some regression > >>>> in behavior. I?m trying to make the detection robust so that it will > fallback to the current behavior in the event > >>>> that cgroups is not configured as expected but I?d like to have a way > of forcing the issue. JDK 10 is not > >>>> supposed to be a long term support release which makes it a good > target for this new behavior. > >>>> I agree with David that once we commit to cgroups, we should extract > all VM configuration data from that > >>>> source. There?s more information available for cpusets than just > processor affinity that we might want to > >>>> consider when calculating the number of processors to assume for the > VM. There?s exclusivity and > >>>> effective cpu data available in addition to the cpuset string. > >>> > >>> cgroup only contains limits, not the real hard limits. > >>> You most consider the affinity mask. We that have numa nodes do: > >>> > >>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java > -Xlog:os=debug -cp . ForEver | grep proc > >>> [0.001s][debug][os] Initial active processor count set to 16 > >>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java > -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc > >>> [0.001s][debug][os] Initial active processor count set to 32 > >>> > >>> when benchmarking all the time and that must be set to 16 otherwise > the flag is really bad for us. > >>> So the flag actually breaks the little numa support we have now. > >>> > >>> Thanks, Robbin > > From david.holmes at oracle.com Wed Oct 4 21:51:47 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 5 Oct 2017 07:51:47 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> Message-ID: <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> Hi Alex, Can you tell me how shares/quotas are actually implemented in terms of allocating "cpus" to processes when shares/quotas are being applied? For example in a 12 cpu system if I have a 50% share do I get all 12 CPUs for 50% of a "quantum" each, or do I get 6 CPUs for a full quantum each? When we try to use the "number of processors" to control the number of threads created, or the number of partitions in a task, then we really want to know how many CPUs we can actually be concurrently running on! Thanks, David On 5/10/2017 6:01 AM, Alex Bagehot wrote: > Hi, > > On Wed, Oct 4, 2017 at 7:51 PM, Bob Vandette > wrote: > >> >>> On Oct 4, 2017, at 2:30 PM, Robbin Ehn wrote: >>> >>> Thanks Bob for looking into this. >>> >>> On 10/04/2017 08:14 PM, Bob Vandette wrote: >>>> Robbin, >>>> I?ve looked into this issue and you are correct. I do have to examine >> both the >>>> sched_getaffinity results as well as the cgroup cpu subsystem >> configuration >>>> files in order to provide a reasonable value for active_processors. If >> I was only >>>> interested in cpusets, I could simply rely on the getaffinity call but >> I also want to >>>> factor in shares and quotas as well. 
>>> >>> We had a quick discussion at the office, we actually do think that you >> could skip reading the shares and quotas. >>> It really depends on what the user expect, if he give us 4 cpu's with >> 50% or 2 full cpu what do he expect the differences would be? >>> One could argue that he 'knows' that he will only use max 50% and thus >> we can act as if he is giving us 4 full cpu. >>> But I'll leave that up to you, just a tough we had. >> >> It?s my opinion that we should do something if someone makes the effort to >> configure their >> containers to use quotas or shares. There are many different opinions on >> what the right that >> right ?something? is. >> > > It might be interesting to look at some real instances of how java might[3] > be deployed in containers. > Marathon/Mesos[1] and Kubernetes[2] use shares and quotas so this is a vast > chunk of deployments that need both of them today. > > >> >> Many developers that are trying to deploy apps that use containers say >> they don?t like >> cpusets. This is too limiting for them especially when the server >> configurations vary >> within their organization. >> > > True, however Kubernetes has an alpha feature[5] where it allocates cpusets > to containers that request a whole number of cpus. Previously without > cpusets any container could run on any cpu which we know might not be good > for some workloads that want isolation. A request for a fractional or > burstable amount of cpu would be allocated from a shared cpu pool. So > although manual allocation of cpusets will be flakey[3] , automation should > be able to make it work. > > >> >> From everything I?ve read including source code, there seems to be a >> consensus that >> shares and quotas are being used as a way to specify a fraction of a >> system (number of cpus). >> > > A refinement[6] on this is: > Shares can be used for guaranteed cpu - you will always get your share. > Quota[4] is a limit/constraint - you can never get more than the quota. > So given the below limit of how many shares will be allocated on a host you > can have burstable(or overcommit) capacity if your shares are less than > your quota. > > >> >> Docker added ?cpus which is implemented using quotas and periods. They >> adjust these >> two parameters to provide a way of calculating the number of cpus that >> will be available >> to a process (quota/period). Amazon also documents that cpu shares are >> defined to be a multiple of 1024. >> Where 1024 represents a single cpu and a share value of N*1024 represents >> N cpus. >> > > Kubernetes and Mesos/Marathon also use the N*1024 shares per host to > allocate resources automatically. > > Hopefully this provides some background on what a couple of orchestration > systems that will be running java are doing currently in this area. 
> Thanks, > Alex > > > [1] https://github.com/apache/mesos/commit/346cc8dd528a28a6e > 1f1cbdb4c95b8bdea2f6070 / (now out of date but appears to be a reasonable > intro : https://zcox.wordpress.com/2014/09/17/cpu-resources-in-docke > r-mesos-and-marathon/ ) > [1a] https://youtu.be/hJyAfC-Z2xk?t=2439 > > [2] https://kubernetes.io/docs/concepts/configuration/manage > -compute-resources-container/ > > [3] https://youtu.be/w1rZOY5gbvk?t=2479 > > [4] https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt > https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf > https://lwn.net/Articles/428175/ > > [5] > https://github.com/kubernetes/community/blob/43ce57ac476b9f2ce3f0220354a075e095a0d469/contributors/design-proposals/node/cpu-manager.md > / https://github.com/kubernetes/kubernetes/commit/ > 00f0e0f6504ad8dd85fcbbd6294cd7cf2475fc72 / https://vimeo.com/226858314 > > [6] https://kubernetes.io/docs/concepts/configuration/manage- > compute-resources-container/#how-pods-with-resource-limits-are-run > > >> Of course these are just conventions. This is why I provided a way of >> specifying the >> number of CPUs so folks deploying Java services can be certain they get >> what they want. >> >> Bob. >> >>> >>>> I had assumed that when sched_setaffinity was called (in your case by >> numactl) that the >>>> cgroup cpu config files would be updated to reflect the current >> processor affinity for the >>>> running process. This is not correct. I have updated my changeset and >> have successfully >>>> run with your examples below. I?ll post a new webrev soon. >>> >>> I see, thanks again! >>> >>> /Robbin >>> >>>> Thanks, >>>> Bob. >>>>> >>>>>> I still want to include the flag for at least one Java release in the >> event that the new behavior causes some regression >>>>>> in behavior. I?m trying to make the detection robust so that it will >> fallback to the current behavior in the event >>>>>> that cgroups is not configured as expected but I?d like to have a way >> of forcing the issue. JDK 10 is not >>>>>> supposed to be a long term support release which makes it a good >> target for this new behavior. >>>>>> I agree with David that once we commit to cgroups, we should extract >> all VM configuration data from that >>>>>> source. There?s more information available for cpusets than just >> processor affinity that we might want to >>>>>> consider when calculating the number of processors to assume for the >> VM. There?s exclusivity and >>>>>> effective cpu data available in addition to the cpuset string. >>>>> >>>>> cgroup only contains limits, not the real hard limits. >>>>> You most consider the affinity mask. We that have numa nodes do: >>>>> >>>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java >> -Xlog:os=debug -cp . ForEver | grep proc >>>>> [0.001s][debug][os] Initial active processor count set to 16 >>>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java >> -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc >>>>> [0.001s][debug][os] Initial active processor count set to 32 >>>>> >>>>> when benchmarking all the time and that must be set to 16 otherwise >> the flag is really bad for us. >>>>> So the flag actually breaks the little numa support we have now. 
>>>>> >>>>> Thanks, Robbin >> >> From vladimir.kozlov at oracle.com Wed Oct 4 23:05:33 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 4 Oct 2017 16:05:33 -0700 Subject: [10] RFR(S) 8188775: Module jdk.internal.vm.compiler.management has not been granted accessClassInPackage.org.graalvm.compiler.hotspot Message-ID: https://bugs.openjdk.java.net/browse/JDK-8188775 Changes for 8182701[1] missed changes in default.policy for new module jdk.internal.vm.compiler.management. Add missing code: src/java.base/share/lib/security/default.policy @@ -154,6 +154,10 @@ permission java.security.AllPermission; }; +grant codeBase "jrt:/jdk.internal.vm.compiler.management" { + permission java.security.AllPermission; +}; + grant codeBase "jrt:/jdk.jsobject" { permission java.security.AllPermission; }; Verified with failed test. Thanks, Vladimir [1] http://hg.openjdk.java.net/jdk10/hs/rev/8b2054b7d02c From mandy.chung at oracle.com Wed Oct 4 23:07:07 2017 From: mandy.chung at oracle.com (mandy chung) Date: Wed, 4 Oct 2017 16:07:07 -0700 Subject: [10] RFR(S) 8188775: Module jdk.internal.vm.compiler.management has not been granted accessClassInPackage.org.graalvm.compiler.hotspot In-Reply-To: References: Message-ID: <2e050a60-0f6e-503f-df39-31108f0da6d1@oracle.com> +1 Mandy On 10/4/17 4:05 PM, Vladimir Kozlov wrote: > https://bugs.openjdk.java.net/browse/JDK-8188775 > > Changes for 8182701[1] missed changes in default.policy for new module > jdk.internal.vm.compiler.management. > > Add missing code: > > src/java.base/share/lib/security/default.policy > @@ -154,6 +154,10 @@ > ???? permission java.security.AllPermission; > ?}; > > +grant codeBase "jrt:/jdk.internal.vm.compiler.management" { > +??? permission java.security.AllPermission; > +}; > + > ?grant codeBase "jrt:/jdk.jsobject" { > ???? permission java.security.AllPermission; > ?}; > > Verified with failed test. > > Thanks, > Vladimir > > [1] http://hg.openjdk.java.net/jdk10/hs/rev/8b2054b7d02c From vladimir.kozlov at oracle.com Wed Oct 4 23:12:27 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 4 Oct 2017 16:12:27 -0700 Subject: [10] RFR(XS) 8188776: jdk.internal.vm.ci can't export package to upgradeable modules Message-ID: https://bugs.openjdk.java.net/browse/JDK-8188776 8182701 added exports for jdk.vm.ci.runtime package [1] but did not add new exception in the test. Added missing exception in JdkQualifiedExportTest.java test: --- a/test/jdk/jdk/modules/etc/JdkQualifiedExportTest.java +++ b/test/jdk/jdk/modules/etc/JdkQualifiedExportTest.java @@ -70,6 +70,7 @@ static Set KNOWN_EXCEPTIONS = Set.of("jdk.internal.vm.ci/jdk.vm.ci.services", + "jdk.internal.vm.ci/jdk.vm.ci.runtime", "jdk.jsobject/jdk.internal.netscape.javascript.spi"); static void checkExports(ModuleDescriptor md) { Verified with this test. 
Thanks, Vladimir [1] http://hg.openjdk.java.net/jdk10/hs/rev/8b2054b7d02c#l3.1 From vladimir.kozlov at oracle.com Wed Oct 4 23:12:55 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 4 Oct 2017 16:12:55 -0700 Subject: [10] RFR(S) 8188775: Module jdk.internal.vm.compiler.management has not been granted accessClassInPackage.org.graalvm.compiler.hotspot In-Reply-To: <2e050a60-0f6e-503f-df39-31108f0da6d1@oracle.com> References: <2e050a60-0f6e-503f-df39-31108f0da6d1@oracle.com> Message-ID: <7a5e843d-4da1-25e8-d21c-908977707d4c@oracle.com> Thank you, Mandy Vladimir On 10/4/17 4:07 PM, mandy chung wrote: > +1 > > Mandy > > On 10/4/17 4:05 PM, Vladimir Kozlov wrote: >> https://bugs.openjdk.java.net/browse/JDK-8188775 >> >> Changes for 8182701[1] missed changes in default.policy for new module jdk.internal.vm.compiler.management. >> >> Add missing code: >> >> src/java.base/share/lib/security/default.policy >> @@ -154,6 +154,10 @@ >> ???? permission java.security.AllPermission; >> ?}; >> >> +grant codeBase "jrt:/jdk.internal.vm.compiler.management" { >> +??? permission java.security.AllPermission; >> +}; >> + >> ?grant codeBase "jrt:/jdk.jsobject" { >> ???? permission java.security.AllPermission; >> ?}; >> >> Verified with failed test. >> >> Thanks, >> Vladimir >> >> [1] http://hg.openjdk.java.net/jdk10/hs/rev/8b2054b7d02c > From mandy.chung at oracle.com Wed Oct 4 23:15:43 2017 From: mandy.chung at oracle.com (mandy chung) Date: Wed, 4 Oct 2017 16:15:43 -0700 Subject: [10] RFR(XS) 8188776: jdk.internal.vm.ci can't export package to upgradeable modules In-Reply-To: References: Message-ID: <1f6bdab4-8c1d-61a4-4abf-f294590e2eff@oracle.com> +1 Looks like JDK regression tests were not run before pushing JDK-8182701? Mandy On 10/4/17 4:12 PM, Vladimir Kozlov wrote: > https://bugs.openjdk.java.net/browse/JDK-8188776 > > 8182701 added exports for jdk.vm.ci.runtime package [1] but did not > add new exception in the test. > > Added missing exception in JdkQualifiedExportTest.java test: > > --- a/test/jdk/jdk/modules/etc/JdkQualifiedExportTest.java > +++ b/test/jdk/jdk/modules/etc/JdkQualifiedExportTest.java > @@ -70,6 +70,7 @@ > > ???? static Set KNOWN_EXCEPTIONS = > ???????? Set.of("jdk.internal.vm.ci/jdk.vm.ci.services", > +?????????????? "jdk.internal.vm.ci/jdk.vm.ci.runtime", > "jdk.jsobject/jdk.internal.netscape.javascript.spi"); > > ???? static void checkExports(ModuleDescriptor md) { > > Verified with this test. > > Thanks, > Vladimir > > [1] http://hg.openjdk.java.net/jdk10/hs/rev/8b2054b7d02c#l3.1 From vladimir.kozlov at oracle.com Wed Oct 4 23:34:23 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 4 Oct 2017 16:34:23 -0700 Subject: [10] RFR(XS) 8188776: jdk.internal.vm.ci can't export package to upgradeable modules In-Reply-To: <1f6bdab4-8c1d-61a4-4abf-f294590e2eff@oracle.com> References: <1f6bdab4-8c1d-61a4-4abf-f294590e2eff@oracle.com> Message-ID: Thank you, Mandy On 10/4/17 4:15 PM, mandy chung wrote: > +1 > > Looks like JDK regression tests were not run before pushing JDK-8182701? Yes, only hotspot jtreg tests were run unfortunately before the push. We do run jdk_lang regularly in tier5 Nightly testing. Thanks, Vladimir > > Mandy > > On 10/4/17 4:12 PM, Vladimir Kozlov wrote: >> https://bugs.openjdk.java.net/browse/JDK-8188776 >> >> 8182701 added exports for jdk.vm.ci.runtime package [1] but did not add new exception in the test. 
>> >> Added missing exception in JdkQualifiedExportTest.java test: >> >> --- a/test/jdk/jdk/modules/etc/JdkQualifiedExportTest.java >> +++ b/test/jdk/jdk/modules/etc/JdkQualifiedExportTest.java >> @@ -70,6 +70,7 @@ >> >> ???? static Set KNOWN_EXCEPTIONS = >> ???????? Set.of("jdk.internal.vm.ci/jdk.vm.ci.services", >> +?????????????? "jdk.internal.vm.ci/jdk.vm.ci.runtime", >> "jdk.jsobject/jdk.internal.netscape.javascript.spi"); >> >> ???? static void checkExports(ModuleDescriptor md) { >> >> Verified with this test. >> >> Thanks, >> Vladimir >> >> [1] http://hg.openjdk.java.net/jdk10/hs/rev/8b2054b7d02c#l3.1 > From HORIE at jp.ibm.com Thu Oct 5 09:15:55 2017 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Thu, 5 Oct 2017 18:15:55 +0900 Subject: RFR:8188802:PPC64: Failure on assert(lrgmask.is_aligned_sets(RegMask::SlotsPerVecX)) Message-ID: Dear all, Would you please review the following change? Bug: https://bugs.openjdk.java.net/browse/JDK-8188802 Webrev: http://cr.openjdk.java.net/~mhorie/8188802/webrev.00/ This change fixes the assertion failures, which occur after introducing " 8188139:PPC64: Superword Level Parallelization with VSX". I exchanged the order of declarations of alloc_classes for SR and VSR. After this fix, another assertion in rc_class() in ppc.ad failed, I modified the assertion itself to take into account newly added VSRs. I would be happy to revise code if these changes do not make sense. Best regards, -- Michihiro, IBM Research - Tokyo From martin.doerr at sap.com Thu Oct 5 11:05:30 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 5 Oct 2017 11:05:30 +0000 Subject: RFR:8188802:PPC64: Failure on assert(lrgmask.is_aligned_sets(RegMask::SlotsPerVecX)) In-Reply-To: References: Message-ID: <058ad834758242c2a7bc9e39b1aa06df@sap.com> Hi Michihiro, pushed this change as it enables us to build and run the VM again. I have introduced a switch "SuperwordUseVSX" which I only enable on >=Power8. Reason is that you're using Power8 instructions which broke the VM for older processors. Regards, Martin From: Michihiro Horie [mailto:HORIE at jp.ibm.com] Sent: Donnerstag, 5. Oktober 2017 11:16 To: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; Doerr, Martin Cc: Hiroshi H Horii ; Gustavo Romero ; Kazunori Ogata Subject: RFR:8188802:PPC64: Failure on assert(lrgmask.is_aligned_sets(RegMask::SlotsPerVecX)) Dear all, Would you please review the following change? Bug: https://bugs.openjdk.java.net/browse/JDK-8188802 Webrev: http://cr.openjdk.java.net/~mhorie/8188802/webrev.00/ This change fixes the assertion failures, which occur after introducing "8188139:PPC64: Superword Level Parallelization with VSX". I exchanged the order of declarations of alloc_classes for SR and VSR. After this fix, another assertion in rc_class() in ppc.ad failed, I modified the assertion itself to take into account newly added VSRs. I would be happy to revise code if these changes do not make sense. Best regards, -- Michihiro, IBM Research - Tokyo From erik.osterlund at oracle.com Thu Oct 5 13:55:45 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 5 Oct 2017 15:55:45 +0200 Subject: RFR (M): 8188813: Generalize OrderAccess to use templates Message-ID: <59D639E1.7070104@oracle.com> Hi, Now that Atomic has been generalized with templates, the same should to be done to OrderAccess. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8188813 Webrev: http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/ Testing: mach5 hs-tier3 Since Atomic already has a mechanism for type checking generic arguments for Atomic::load/store, and OrderAccess also is a bunch of semantically decorated loads and stores, I decided to reuse the template wheel that was already invented (Atomic::LoadImpl and Atomic::StoreImpl). Therefore, I made OrderAccess privately inherit Atomic so that this infrastructure could be reused. A whole bunch of code has been nuked with this generalization. It is worth noting that I have added PrimitiveConversion functionality for doubles and floats which translates to using the union trick for casting double to and from int64_t and float to and from int32_t when passing down doubles and ints to the API. I need the former two, because Java supports volatile double and volatile float, and therefore runtime support for that needs to be able to use floats and doubles. I also added PrimitiveConversion functionality for the subclasses of oop (instanceOop and friends). The base class oop already supported this, so it seemed natural that the subclasses should support it too. Thanks, /Erik From goetz.lindenmaier at sap.com Thu Oct 5 16:11:58 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 5 Oct 2017 16:11:58 +0000 Subject: Normalize help flags of tools in jdk? Message-ID: <794c228f8b3a4810ae7c885402dda687@sap.com> Hi, I would like to normalize the help flags of the tools in jdk/bin. java accepts -?, -h and --help. I think that's a good set the others should support, too. If this is appreciated, I would complete this webrev to cover all the cases where this is doable with acceptable effort: http://cr.openjdk.java.net/~goetz/wr17/helpMessage/webrev/ Some tools exit with '1' after displaying the help message, while most exit with '0'. Is that intended? See also the test I added, it's implemented similar to tools/launcher/VersionCheck.java. Best regards, Goetz. From ceeaspb at gmail.com Thu Oct 5 16:43:13 2017 From: ceeaspb at gmail.com (Alex Bagehot) Date: Thu, 5 Oct 2017 17:43:13 +0100 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> Message-ID: Hi David, On Wed, Oct 4, 2017 at 10:51 PM, David Holmes wrote: > Hi Alex, > > Can you tell me how shares/quotas are actually implemented in terms of > allocating "cpus" to processes when shares/quotas are being applied? The allocation of cpus to processes/threads(tasks as the kernel sees them) or the other way round is called balancing, which is done by Scheduling domains[3]. cpu shares use CFS "group" scheduling[1] to apply the share to all the tasks(threads) in the container. The container cpu shares weight maps directly to a task's weight in CFS, which given it is part of a group is divided by the number of tasks in the group (ie. a default container share of 1024 with 2 threads in the container/group would result in each thread/task having a 512 weight[4]). The same values used by nice[2] also. 
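To make that arithmetic concrete, a tiny sketch; this is purely illustrative, the kernel computes the weight itself:

#include <cstdio>

// Per-task CFS weight implied by a container's cpu.shares, split across the
// runnable tasks in the group, as described above.
static long per_task_weight(long container_shares, long tasks_in_group) {
  return tasks_in_group > 0 ? container_shares / tasks_in_group
                            : container_shares;
}

int main() {
  // A default container share of 1024 with 2 threads -> 512 each, which
  // matches the se->load.weight figure in the sched_debug dump below.
  printf("%ld\n", per_task_weight(1024, 2));
  return 0;
}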
You can observe the task weight and other scheduler numbers in /proc/sched_debug [4]. You can also kernel trace scheduler activity which typically tells you the tasks involved, the cpu, the event: switch or wakeup, etc. > For example in a 12 cpu system if I have a 50% share do I get all 12 CPUs > for 50% of a "quantum" each, or do I get 6 CPUs for a full quantum each? > You get 12 cpus for 50% of the time on the average if there is another workload that has the same weight as you and is consuming as much as it can. If there's nothing else running on the machine you get 12 cpus for 100% of the time with a cpu shares only config (ie. the burst capacity). I validated that the share was balanced over all the cpus by running linux perf events and checking that there were cpu samples on all cpus. There's bound to be other ways of doing it also. > > When we try to use the "number of processors" to control the number of > threads created, or the number of partitions in a task, then we really want > to know how many CPUs we can actually be concurrently running on! > Makes sense to check. Hopefully there aren't any major errors or omissions in the above. Thanks, Alex [1] https://lwn.net/Articles/240474/ [2] https://github.com/torvalds/linux/blob/368f89984bb971b9f8b69eeb85ab19 a89f985809/kernel/sched/core.c#L6735 [3] https://lwn.net/Articles/80911/ / http://www.i3s.unice.fr/~ jplozi/wastedcores/files/extended_talk.pdf [4] cfs_rq[13]:/system.slice/docker-f5681788d6daab249c90810fe60da4 29a2565b901ff34245922a578635b5d607.scope .exec_clock : 0.000000 .MIN_vruntime : 0.000001 .min_vruntime : 8090.087297 .max_vruntime : 0.000001 .spread : 0.000000 .spread0 : -124692718.052832 .nr_spread_over : 0 .nr_running : 1 .load : 1024 .runnable_load_avg : 1023 .blocked_load_avg : 0 .tg_load_avg : 2046 .tg_load_contrib : 1023 .tg_runnable_contrib : 1023 .tg->runnable_avg : 2036 .tg->cfs_bandwidth.timer_active: 0 .throttled : 0 .throttle_count : 0 .se->exec_start : 236081964.515645 .se->vruntime : 24403993.326934 .se->sum_exec_runtime : 8091.135873 .se->load.weight : 512 .se->avg.runnable_avg_sum : 45979 .se->avg.runnable_avg_period : 45979 .se->avg.load_avg_contrib : 511 .se->avg.decay_count : 0 > > Thanks, > David > > > On 5/10/2017 6:01 AM, Alex Bagehot wrote: > >> Hi, >> >> On Wed, Oct 4, 2017 at 7:51 PM, Bob Vandette >> wrote: >> >> >>> On Oct 4, 2017, at 2:30 PM, Robbin Ehn wrote: >>>> >>>> Thanks Bob for looking into this. >>>> >>>> On 10/04/2017 08:14 PM, Bob Vandette wrote: >>>> >>>>> Robbin, >>>>> I?ve looked into this issue and you are correct. I do have to examine >>>>> >>>> both the >>> >>>> sched_getaffinity results as well as the cgroup cpu subsystem >>>>> >>>> configuration >>> >>>> files in order to provide a reasonable value for active_processors. If >>>>> >>>> I was only >>> >>>> interested in cpusets, I could simply rely on the getaffinity call but >>>>> >>>> I also want to >>> >>>> factor in shares and quotas as well. >>>>> >>>> >>>> We had a quick discussion at the office, we actually do think that you >>>> >>> could skip reading the shares and quotas. >>> >>>> It really depends on what the user expect, if he give us 4 cpu's with >>>> >>> 50% or 2 full cpu what do he expect the differences would be? >>> >>>> One could argue that he 'knows' that he will only use max 50% and thus >>>> >>> we can act as if he is giving us 4 full cpu. >>> >>>> But I'll leave that up to you, just a tough we had. 
>>>> >>> >>> It?s my opinion that we should do something if someone makes the effort >>> to >>> configure their >>> containers to use quotas or shares. There are many different opinions on >>> what the right that >>> right ?something? is. >>> >>> >> It might be interesting to look at some real instances of how java >> might[3] >> be deployed in containers. >> Marathon/Mesos[1] and Kubernetes[2] use shares and quotas so this is a >> vast >> chunk of deployments that need both of them today. >> >> >> >>> Many developers that are trying to deploy apps that use containers say >>> they don?t like >>> cpusets. This is too limiting for them especially when the server >>> configurations vary >>> within their organization. >>> >>> >> True, however Kubernetes has an alpha feature[5] where it allocates >> cpusets >> to containers that request a whole number of cpus. Previously without >> cpusets any container could run on any cpu which we know might not be good >> for some workloads that want isolation. A request for a fractional or >> burstable amount of cpu would be allocated from a shared cpu pool. So >> although manual allocation of cpusets will be flakey[3] , automation >> should >> be able to make it work. >> >> >> >>> From everything I?ve read including source code, there seems to be a >>> consensus that >>> shares and quotas are being used as a way to specify a fraction of a >>> system (number of cpus). >>> >>> >> A refinement[6] on this is: >> Shares can be used for guaranteed cpu - you will always get your share. >> Quota[4] is a limit/constraint - you can never get more than the quota. >> So given the below limit of how many shares will be allocated on a host >> you >> can have burstable(or overcommit) capacity if your shares are less than >> your quota. >> >> >> >>> Docker added ?cpus which is implemented using quotas and periods. They >>> adjust these >>> two parameters to provide a way of calculating the number of cpus that >>> will be available >>> to a process (quota/period). Amazon also documents that cpu shares are >>> defined to be a multiple of 1024. >>> Where 1024 represents a single cpu and a share value of N*1024 represents >>> N cpus. >>> >>> >> Kubernetes and Mesos/Marathon also use the N*1024 shares per host to >> allocate resources automatically. >> >> Hopefully this provides some background on what a couple of orchestration >> systems that will be running java are doing currently in this area. 
>> Thanks, >> Alex >> >> >> [1] https://github.com/apache/mesos/commit/346cc8dd528a28a6e >> 1f1cbdb4c95b8bdea2f6070 / (now out of date but appears to be a reasonable >> intro : https://zcox.wordpress.com/2014/09/17/cpu-resources-in-docke >> r-mesos-and-marathon/ ) >> [1a] https://youtu.be/hJyAfC-Z2xk?t=2439 >> >> [2] https://kubernetes.io/docs/concepts/configuration/manage >> -compute-resources-container/ >> >> [3] https://youtu.be/w1rZOY5gbvk?t=2479 >> >> [4] https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt >> https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf >> https://lwn.net/Articles/428175/ >> >> [5] >> https://github.com/kubernetes/community/blob/43ce57ac476b9f2 >> ce3f0220354a075e095a0d469/contributors/design-proposals/node >> /cpu-manager.md >> / https://github.com/kubernetes/kubernetes/commit/ >> 00f0e0f6504ad8dd85fcbbd6294cd7cf2475fc72 / https://vimeo.com/226858314 >> >> >> [6] https://kubernetes.io/docs/concepts/configuration/manage- >> compute-resources-container/#how-pods-with-resource-limits-are-run >> >> >> Of course these are just conventions. This is why I provided a way of >>> specifying the >>> number of CPUs so folks deploying Java services can be certain they get >>> what they want. >>> >>> Bob. >>> >>> >>>> I had assumed that when sched_setaffinity was called (in your case by >>>>> >>>> numactl) that the >>> >>>> cgroup cpu config files would be updated to reflect the current >>>>> >>>> processor affinity for the >>> >>>> running process. This is not correct. I have updated my changeset and >>>>> >>>> have successfully >>> >>>> run with your examples below. I?ll post a new webrev soon. >>>>> >>>> >>>> I see, thanks again! >>>> >>>> /Robbin >>>> >>>> Thanks, >>>>> Bob. >>>>> >>>>>> >>>>>> I still want to include the flag for at least one Java release in the >>>>>>> >>>>>> event that the new behavior causes some regression >>> >>>> in behavior. I?m trying to make the detection robust so that it will >>>>>>> >>>>>> fallback to the current behavior in the event >>> >>>> that cgroups is not configured as expected but I?d like to have a way >>>>>>> >>>>>> of forcing the issue. JDK 10 is not >>> >>>> supposed to be a long term support release which makes it a good >>>>>>> >>>>>> target for this new behavior. >>> >>>> I agree with David that once we commit to cgroups, we should extract >>>>>>> >>>>>> all VM configuration data from that >>> >>>> source. There?s more information available for cpusets than just >>>>>>> >>>>>> processor affinity that we might want to >>> >>>> consider when calculating the number of processors to assume for the >>>>>>> >>>>>> VM. There?s exclusivity and >>> >>>> effective cpu data available in addition to the cpuset string. >>>>>>> >>>>>> >>>>>> cgroup only contains limits, not the real hard limits. >>>>>> You most consider the affinity mask. We that have numa nodes do: >>>>>> >>>>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java >>>>>> >>>>> -Xlog:os=debug -cp . ForEver | grep proc >>> >>>> [0.001s][debug][os] Initial active processor count set to 16 >>>>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java >>>>>> >>>>> -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc >>> >>>> [0.001s][debug][os] Initial active processor count set to 32 >>>>>> >>>>>> when benchmarking all the time and that must be set to 16 otherwise >>>>>> >>>>> the flag is really bad for us. >>> >>>> So the flag actually breaks the little numa support we have now. 
>>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>> >>> >>> From jaroslav.tulach at oracle.com Thu Oct 5 15:32:39 2017 From: jaroslav.tulach at oracle.com (Jaroslav Tulach) Date: Thu, 05 Oct 2017 17:32:39 +0200 Subject: [10] RFR(S) 8188775: Module jdk.internal.vm.compiler.management has not been granted accessClassInPackage.org.graalvm.compiler.hotspot In-Reply-To: References: Message-ID: <2799842.XxlxnWyqlB@pracovni> Opps. Sorry for causing the problem. I haven't executed the test in question and thus I thought everything is OK. Thanks Vladimir for creating the fix. -jt On st?eda 4. ??jna 2017 16:05:33 CEST Vladimir Kozlov wrote: > https://bugs.openjdk.java.net/browse/JDK-8188775 > > Changes for 8182701[1] missed changes in default.policy for new module > jdk.internal.vm.compiler.management. > > Add missing code: > > src/java.base/share/lib/security/default.policy > @@ -154,6 +154,10 @@ > permission java.security.AllPermission; > }; > > +grant codeBase "jrt:/jdk.internal.vm.compiler.management" { > + permission java.security.AllPermission; > +}; > + > grant codeBase "jrt:/jdk.jsobject" { > permission java.security.AllPermission; > }; > > Verified with failed test. > > Thanks, > Vladimir > > [1] http://hg.openjdk.java.net/jdk10/hs/rev/8b2054b7d02c From bob.vandette at oracle.com Thu Oct 5 17:57:26 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 5 Oct 2017 13:57:26 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> Message-ID: <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> > On Oct 5, 2017, at 12:43 PM, Alex Bagehot wrote: > > Hi David, > > On Wed, Oct 4, 2017 at 10:51 PM, David Holmes > wrote: > Hi Alex, > > Can you tell me how shares/quotas are actually implemented in terms of allocating "cpus" to processes when shares/quotas are being applied? > > The allocation of cpus to processes/threads(tasks as the kernel sees them) or the other way round is called balancing, which is done by Scheduling domains[3]. > > cpu shares use CFS "group" scheduling[1] to apply the share to all the tasks(threads) in the container. The container cpu shares weight maps directly to a task's weight in CFS, which given it is part of a group is divided by the number of tasks in the group (ie. a default container share of 1024 with 2 threads in the container/group would result in each thread/task having a 512 weight[4]). The same values used by nice[2] also. > > You can observe the task weight and other scheduler numbers in /proc/sched_debug [4]. You can also kernel trace scheduler activity which typically tells you the tasks involved, the cpu, the event: switch or wakeup, etc. > > For example in a 12 cpu system if I have a 50% share do I get all 12 CPUs for 50% of a "quantum" each, or do I get 6 CPUs for a full quantum each? > > You get 12 cpus for 50% of the time on the average if there is another workload that has the same weight as you and is consuming as much as it can. > If there's nothing else running on the machine you get 12 cpus for 100% of the time with a cpu shares only config (ie. the burst capacity). 
> > I validated that the share was balanced over all the cpus by running linux perf events and checking that there were cpu samples on all cpus. There's bound to be other ways of doing it also. > > > When we try to use the "number of processors" to control the number of threads created, or the number of partitions in a task, then we really want to know how many CPUs we can actually be concurrently running on! I?m not sure that the primary question for serverless container execution. Just because you might happen to burst and have available to you more CPU time than you specified in your shares doesn?t mean that a multi-threaded application running in one of these containers should configure itself to use all available host processors. This would result in over-burdoning the system at times of high load. The Java runtime, at startup, configures several subsystems to use a number of threads for each system based on the number of available processors. These subsystems include things like the number of GC threads, JIT compiler and thread pools. The problem I am trying to solve is to come up with a single number of CPUs based on container knowledge that can be used for the Java runtime subsystem to configure itself. I believe that we should trust the implementor of the Mesos or Kubernetes setup and honor their wishes when coming up with this number and not just use the processor affinity or number of cpus in the cpuset. The challenge is determining the right algorithm that doesn?t penalize the VM. My current implementation does this: total available logical processors = min (cpusets,sched_getaffinity,shares/1024, quota/period) All fractional units are rounded up to the next whole number. Bob. > > Makes sense to check. Hopefully there aren't any major errors or omissions in the above. > Thanks, > Alex > > [1] https://lwn.net/Articles/240474/ > [2] https://github.com/torvalds/linux/blob/368f89984bb971b9f8b69eeb85ab19a89f985809/kernel/sched/core.c#L6735 > [3] https://lwn.net/Articles/80911/ / http://www.i3s.unice.fr/~jplozi/wastedcores/files/extended_talk.pdf > > [4] > cfs_rq[13]:/system.slice/docker-f5681788d6daab249c90810fe60da429a2565b901ff34245922a578635b5d607.scope > > .exec_clock : 0.000000 > > .MIN_vruntime : 0.000001 > > .min_vruntime : 8090.087297 > > .max_vruntime : 0.000001 > > .spread : 0.000000 > > .spread0 : -124692718.052832 > > .nr_spread_over : 0 > > .nr_running : 1 > > .load : 1024 > > .runnable_load_avg : 1023 > > .blocked_load_avg : 0 > > .tg_load_avg : 2046 > > .tg_load_contrib : 1023 > > .tg_runnable_contrib : 1023 > > .tg->runnable_avg : 2036 > > .tg->cfs_bandwidth.timer_active: 0 > > .throttled : 0 > > .throttle_count : 0 > > .se->exec_start : 236081964.515645 > > .se->vruntime : 24403993.326934 > > .se->sum_exec_runtime : 8091.135873 > > .se->load.weight : 512 > > .se->avg.runnable_avg_sum : 45979 > > .se->avg.runnable_avg_period : 45979 > > .se->avg.load_avg_contrib : 511 > > .se->avg.decay_count : 0 > > > > Thanks, > David > > > On 5/10/2017 6:01 AM, Alex Bagehot wrote: > Hi, > > On Wed, Oct 4, 2017 at 7:51 PM, Bob Vandette > > wrote: > > > On Oct 4, 2017, at 2:30 PM, Robbin Ehn > wrote: > > Thanks Bob for looking into this. > > On 10/04/2017 08:14 PM, Bob Vandette wrote: > Robbin, > I?ve looked into this issue and you are correct. I do have to examine > both the > sched_getaffinity results as well as the cgroup cpu subsystem > configuration > files in order to provide a reasonable value for active_processors. 
If > I was only > interested in cpusets, I could simply rely on the getaffinity call but > I also want to > factor in shares and quotas as well. > > We had a quick discussion at the office, we actually do think that you > could skip reading the shares and quotas. > It really depends on what the user expect, if he give us 4 cpu's with > 50% or 2 full cpu what do he expect the differences would be? > One could argue that he 'knows' that he will only use max 50% and thus > we can act as if he is giving us 4 full cpu. > But I'll leave that up to you, just a tough we had. > > It?s my opinion that we should do something if someone makes the effort to > configure their > containers to use quotas or shares. There are many different opinions on > what the right that > right ?something? is. > > > It might be interesting to look at some real instances of how java might[3] > be deployed in containers. > Marathon/Mesos[1] and Kubernetes[2] use shares and quotas so this is a vast > chunk of deployments that need both of them today. > > > > Many developers that are trying to deploy apps that use containers say > they don?t like > cpusets. This is too limiting for them especially when the server > configurations vary > within their organization. > > > True, however Kubernetes has an alpha feature[5] where it allocates cpusets > to containers that request a whole number of cpus. Previously without > cpusets any container could run on any cpu which we know might not be good > for some workloads that want isolation. A request for a fractional or > burstable amount of cpu would be allocated from a shared cpu pool. So > although manual allocation of cpusets will be flakey[3] , automation should > be able to make it work. > > > > From everything I?ve read including source code, there seems to be a > consensus that > shares and quotas are being used as a way to specify a fraction of a > system (number of cpus). > > > A refinement[6] on this is: > Shares can be used for guaranteed cpu - you will always get your share. > Quota[4] is a limit/constraint - you can never get more than the quota. > So given the below limit of how many shares will be allocated on a host you > can have burstable(or overcommit) capacity if your shares are less than > your quota. > > > > Docker added ?cpus which is implemented using quotas and periods. They > adjust these > two parameters to provide a way of calculating the number of cpus that > will be available > to a process (quota/period). Amazon also documents that cpu shares are > defined to be a multiple of 1024. > Where 1024 represents a single cpu and a share value of N*1024 represents > N cpus. > > > Kubernetes and Mesos/Marathon also use the N*1024 shares per host to > allocate resources automatically. > > Hopefully this provides some background on what a couple of orchestration > systems that will be running java are doing currently in this area. 
> Thanks, > Alex > > > [1] https://github.com/apache/mesos/commit/346cc8dd528a28a6e > 1f1cbdb4c95b8bdea2f6070 / (now out of date but appears to be a reasonable > intro : https://zcox.wordpress.com/2014/09/17/cpu-resources-in-docke > r-mesos-and-marathon/ ) > [1a] https://youtu.be/hJyAfC-Z2xk?t=2439 > > [2] https://kubernetes.io/docs/concepts/configuration/manage > -compute-resources-container/ > > [3] https://youtu.be/w1rZOY5gbvk?t=2479 > > [4] https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt > https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf > https://lwn.net/Articles/428175/ > > [5] > https://github.com/kubernetes/community/blob/43ce57ac476b9f2ce3f0220354a075e095a0d469/contributors/design-proposals/node/cpu-manager.md > / https://github.com/kubernetes/kubernetes/commit/ > 00f0e0f6504ad8dd85fcbbd6294cd7cf2475fc72 / https://vimeo.com/226858314 > > > [6] https://kubernetes.io/docs/concepts/configuration/manage- > compute-resources-container/#how-pods-with-resource-limits-are-run > > > Of course these are just conventions. This is why I provided a way of > specifying the > number of CPUs so folks deploying Java services can be certain they get > what they want. > > Bob. > > > I had assumed that when sched_setaffinity was called (in your case by > numactl) that the > cgroup cpu config files would be updated to reflect the current > processor affinity for the > running process. This is not correct. I have updated my changeset and > have successfully > run with your examples below. I?ll post a new webrev soon. > > I see, thanks again! > > /Robbin > > Thanks, > Bob. > > I still want to include the flag for at least one Java release in the > event that the new behavior causes some regression > in behavior. I?m trying to make the detection robust so that it will > fallback to the current behavior in the event > that cgroups is not configured as expected but I?d like to have a way > of forcing the issue. JDK 10 is not > supposed to be a long term support release which makes it a good > target for this new behavior. > I agree with David that once we commit to cgroups, we should extract > all VM configuration data from that > source. There?s more information available for cpusets than just > processor affinity that we might want to > consider when calculating the number of processors to assume for the > VM. There?s exclusivity and > effective cpu data available in addition to the cpuset string. > > cgroup only contains limits, not the real hard limits. > You most consider the affinity mask. We that have numa nodes do: > > [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java > -Xlog:os=debug -cp . ForEver | grep proc > [0.001s][debug][os] Initial active processor count set to 16 > [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java > -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc > [0.001s][debug][os] Initial active processor count set to 32 > > when benchmarking all the time and that must be set to 16 otherwise > the flag is really bad for us. > So the flag actually breaks the little numa support we have now. 
> > Thanks, Robbin > > > From karen.kinnear at oracle.com Thu Oct 5 19:13:30 2017 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Thu, 5 Oct 2017 15:13:30 -0400 Subject: CFV: New hotspot Group Member: Ioi Lam In-Reply-To: References: Message-ID: vote: yes Karen > On Oct 2, 2017, at 11:24 AM, coleen.phillimore at oracle.com wrote: > > I hereby nominate Ioi Lam (OpenJDK user name: iklam) to Membership in the hotspot Group. > > Ioi has been working on the hotspot project for over 5 years and is a Reviewer in the JDK 9 Project with 79 changes. He is an expert in the area of class data sharing. > > Votes are due by Monday, October 16, 2017. > > Only current Members of the hotspot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen > > [1]http://openjdk.java.net/census#hotspot > [2]http://openjdk.java.net/groups/#member-vote From robbin.ehn at oracle.com Thu Oct 5 19:17:10 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 5 Oct 2017 21:17:10 +0200 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> Message-ID: <65492519-d2b2-82ae-37a0-4540d4c5b937@oracle.com> Hi Alex, just a short question, You said something about "Marathon/Mesos[1] and Kubernetes[2] use shares and quotas" If you only use shares and quotas, do you not care about numa? (read trust kernel) On would think that you would setup a cgroup per numa node and split those into cgroups with shares/quotas. Thanks, Robbin On 10/05/2017 06:43 PM, Alex Bagehot wrote: > Hi David, > > On Wed, Oct 4, 2017 at 10:51 PM, David Holmes > wrote: > >> Hi Alex, >> >> Can you tell me how shares/quotas are actually implemented in terms of >> allocating "cpus" to processes when shares/quotas are being applied? > > > The allocation of cpus to processes/threads(tasks as the kernel sees them) > or the other way round is called balancing, which is done by Scheduling > domains[3]. > > cpu shares use CFS "group" scheduling[1] to apply the share to all the > tasks(threads) in the container. The container cpu shares weight maps > directly to a task's weight in CFS, which given it is part of a group is > divided by the number of tasks in the group (ie. a default container share > of 1024 with 2 threads in the container/group would result in each > thread/task having a 512 weight[4]). The same values used by nice[2] also. > > You can observe the task weight and other scheduler numbers in > /proc/sched_debug [4]. You can also kernel trace scheduler activity which > typically tells you the tasks involved, the cpu, the event: switch or > wakeup, etc. > > >> For example in a 12 cpu system if I have a 50% share do I get all 12 CPUs >> for 50% of a "quantum" each, or do I get 6 CPUs for a full quantum each? >> > > You get 12 cpus for 50% of the time on the average if there is another > workload that has the same weight as you and is consuming as much as it can. > If there's nothing else running on the machine you get 12 cpus for 100% of > the time with a cpu shares only config (ie. the burst capacity). 
> > I validated that the share was balanced over all the cpus by running linux > perf events and checking that there were cpu samples on all cpus. There's > bound to be other ways of doing it also. > > >> >> When we try to use the "number of processors" to control the number of >> threads created, or the number of partitions in a task, then we really want >> to know how many CPUs we can actually be concurrently running on! >> > > Makes sense to check. Hopefully there aren't any major errors or omissions > in the above. > Thanks, > Alex > > [1] https://lwn.net/Articles/240474/ > [2] https://github.com/torvalds/linux/blob/368f89984bb971b9f8b69eeb85ab19 > a89f985809/kernel/sched/core.c#L6735 > [3] https://lwn.net/Articles/80911/ / http://www.i3s.unice.fr/~ > jplozi/wastedcores/files/extended_talk.pdf > > [4] > > cfs_rq[13]:/system.slice/docker-f5681788d6daab249c90810fe60da4 > 29a2565b901ff34245922a578635b5d607.scope > > .exec_clock : 0.000000 > > .MIN_vruntime : 0.000001 > > .min_vruntime : 8090.087297 > > .max_vruntime : 0.000001 > > .spread : 0.000000 > > .spread0 : -124692718.052832 > > .nr_spread_over : 0 > > .nr_running : 1 > > .load : 1024 > > .runnable_load_avg : 1023 > > .blocked_load_avg : 0 > > .tg_load_avg : 2046 > > .tg_load_contrib : 1023 > > .tg_runnable_contrib : 1023 > > .tg->runnable_avg : 2036 > > .tg->cfs_bandwidth.timer_active: 0 > > .throttled : 0 > > .throttle_count : 0 > > .se->exec_start : 236081964.515645 > > .se->vruntime : 24403993.326934 > > .se->sum_exec_runtime : 8091.135873 > > .se->load.weight : 512 > > .se->avg.runnable_avg_sum : 45979 > > .se->avg.runnable_avg_period : 45979 > > .se->avg.load_avg_contrib : 511 > > .se->avg.decay_count : 0 > > >> >> Thanks, >> David >> >> >> On 5/10/2017 6:01 AM, Alex Bagehot wrote: >> >>> Hi, >>> >>> On Wed, Oct 4, 2017 at 7:51 PM, Bob Vandette >>> wrote: >>> >>> >>>> On Oct 4, 2017, at 2:30 PM, Robbin Ehn wrote: >>>>> >>>>> Thanks Bob for looking into this. >>>>> >>>>> On 10/04/2017 08:14 PM, Bob Vandette wrote: >>>>> >>>>>> Robbin, >>>>>> I?ve looked into this issue and you are correct. I do have to examine >>>>>> >>>>> both the >>>> >>>>> sched_getaffinity results as well as the cgroup cpu subsystem >>>>>> >>>>> configuration >>>> >>>>> files in order to provide a reasonable value for active_processors. If >>>>>> >>>>> I was only >>>> >>>>> interested in cpusets, I could simply rely on the getaffinity call but >>>>>> >>>>> I also want to >>>> >>>>> factor in shares and quotas as well. >>>>>> >>>>> >>>>> We had a quick discussion at the office, we actually do think that you >>>>> >>>> could skip reading the shares and quotas. >>>> >>>>> It really depends on what the user expect, if he give us 4 cpu's with >>>>> >>>> 50% or 2 full cpu what do he expect the differences would be? >>>> >>>>> One could argue that he 'knows' that he will only use max 50% and thus >>>>> >>>> we can act as if he is giving us 4 full cpu. >>>> >>>>> But I'll leave that up to you, just a tough we had. >>>>> >>>> >>>> It?s my opinion that we should do something if someone makes the effort >>>> to >>>> configure their >>>> containers to use quotas or shares. There are many different opinions on >>>> what the right that >>>> right ?something? is. >>>> >>>> >>> It might be interesting to look at some real instances of how java >>> might[3] >>> be deployed in containers. >>> Marathon/Mesos[1] and Kubernetes[2] use shares and quotas so this is a >>> vast >>> chunk of deployments that need both of them today. 
>>> >>> >>> >>>> Many developers that are trying to deploy apps that use containers say >>>> they don?t like >>>> cpusets. This is too limiting for them especially when the server >>>> configurations vary >>>> within their organization. >>>> >>>> >>> True, however Kubernetes has an alpha feature[5] where it allocates >>> cpusets >>> to containers that request a whole number of cpus. Previously without >>> cpusets any container could run on any cpu which we know might not be good >>> for some workloads that want isolation. A request for a fractional or >>> burstable amount of cpu would be allocated from a shared cpu pool. So >>> although manual allocation of cpusets will be flakey[3] , automation >>> should >>> be able to make it work. >>> >>> >>> >>>> From everything I?ve read including source code, there seems to be a >>>> consensus that >>>> shares and quotas are being used as a way to specify a fraction of a >>>> system (number of cpus). >>>> >>>> >>> A refinement[6] on this is: >>> Shares can be used for guaranteed cpu - you will always get your share. >>> Quota[4] is a limit/constraint - you can never get more than the quota. >>> So given the below limit of how many shares will be allocated on a host >>> you >>> can have burstable(or overcommit) capacity if your shares are less than >>> your quota. >>> >>> >>> >>>> Docker added ?cpus which is implemented using quotas and periods. They >>>> adjust these >>>> two parameters to provide a way of calculating the number of cpus that >>>> will be available >>>> to a process (quota/period). Amazon also documents that cpu shares are >>>> defined to be a multiple of 1024. >>>> Where 1024 represents a single cpu and a share value of N*1024 represents >>>> N cpus. >>>> >>>> >>> Kubernetes and Mesos/Marathon also use the N*1024 shares per host to >>> allocate resources automatically. >>> >>> Hopefully this provides some background on what a couple of orchestration >>> systems that will be running java are doing currently in this area. >>> Thanks, >>> Alex >>> >>> >>> [1] https://github.com/apache/mesos/commit/346cc8dd528a28a6e >>> 1f1cbdb4c95b8bdea2f6070 / (now out of date but appears to be a reasonable >>> intro : https://zcox.wordpress.com/2014/09/17/cpu-resources-in-docke >>> r-mesos-and-marathon/ ) >>> [1a] https://youtu.be/hJyAfC-Z2xk?t=2439 >>> >>> [2] https://kubernetes.io/docs/concepts/configuration/manage >>> -compute-resources-container/ >>> >>> [3] https://youtu.be/w1rZOY5gbvk?t=2479 >>> >>> [4] https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt >>> https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf >>> https://lwn.net/Articles/428175/ >>> >>> [5] >>> https://github.com/kubernetes/community/blob/43ce57ac476b9f2 >>> ce3f0220354a075e095a0d469/contributors/design-proposals/node >>> /cpu-manager.md >>> / https://github.com/kubernetes/kubernetes/commit/ >>> 00f0e0f6504ad8dd85fcbbd6294cd7cf2475fc72 / https://vimeo.com/226858314 >>> >>> >>> [6] https://kubernetes.io/docs/concepts/configuration/manage- >>> compute-resources-container/#how-pods-with-resource-limits-are-run >>> >>> >>> Of course these are just conventions. This is why I provided a way of >>>> specifying the >>>> number of CPUs so folks deploying Java services can be certain they get >>>> what they want. >>>> >>>> Bob. 
>>>> >>>> >>>>> I had assumed that when sched_setaffinity was called (in your case by >>>>>> >>>>> numactl) that the >>>> >>>>> cgroup cpu config files would be updated to reflect the current >>>>>> >>>>> processor affinity for the >>>> >>>>> running process. This is not correct. I have updated my changeset and >>>>>> >>>>> have successfully >>>> >>>>> run with your examples below. I?ll post a new webrev soon. >>>>>> >>>>> >>>>> I see, thanks again! >>>>> >>>>> /Robbin >>>>> >>>>> Thanks, >>>>>> Bob. >>>>>> >>>>>>> >>>>>>> I still want to include the flag for at least one Java release in the >>>>>>>> >>>>>>> event that the new behavior causes some regression >>>> >>>>> in behavior. I?m trying to make the detection robust so that it will >>>>>>>> >>>>>>> fallback to the current behavior in the event >>>> >>>>> that cgroups is not configured as expected but I?d like to have a way >>>>>>>> >>>>>>> of forcing the issue. JDK 10 is not >>>> >>>>> supposed to be a long term support release which makes it a good >>>>>>>> >>>>>>> target for this new behavior. >>>> >>>>> I agree with David that once we commit to cgroups, we should extract >>>>>>>> >>>>>>> all VM configuration data from that >>>> >>>>> source. There?s more information available for cpusets than just >>>>>>>> >>>>>>> processor affinity that we might want to >>>> >>>>> consider when calculating the number of processors to assume for the >>>>>>>> >>>>>>> VM. There?s exclusivity and >>>> >>>>> effective cpu data available in addition to the cpuset string. >>>>>>>> >>>>>>> >>>>>>> cgroup only contains limits, not the real hard limits. >>>>>>> You most consider the affinity mask. We that have numa nodes do: >>>>>>> >>>>>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java >>>>>>> >>>>>> -Xlog:os=debug -cp . ForEver | grep proc >>>> >>>>> [0.001s][debug][os] Initial active processor count set to 16 >>>>>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java >>>>>>> >>>>>> -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc >>>> >>>>> [0.001s][debug][os] Initial active processor count set to 32 >>>>>>> >>>>>>> when benchmarking all the time and that must be set to 16 otherwise >>>>>>> >>>>>> the flag is really bad for us. >>>> >>>>> So the flag actually breaks the little numa support we have now. >>>>>>> >>>>>>> Thanks, Robbin >>>>>>> >>>>>> >>>> >>>> From zgu at redhat.com Thu Oct 5 19:47:37 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 5 Oct 2017 15:47:37 -0400 Subject: RFR(XXS) 8187685: NMT: Tracking compiler memory usage of thread's resource area Message-ID: <69808d92-6ac8-9d83-61dc-6bb45936b4dc@redhat.com> Compiler uses resource area for compilation, let's bias it to mtCompiler for more accurate memory counting. Bug: https://bugs.openjdk.java.net/browse/JDK-8187685 Webrev: http://cr.openjdk.java.net/~zgu/8187685/webrev.00/index.html Discussion thread: http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028360.html Test: hotspot_tier1 fastdebug and release on Linux x64. Thanks, -Zhengyu From coleen.phillimore at oracle.com Thu Oct 5 21:55:31 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 5 Oct 2017 17:55:31 -0400 Subject: Result: New hotspot Group Member: Markus Gronlund Message-ID: The vote for Markus Gronlund [1] is now closed. Yes: 11 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. 
Coleen Phillimore [1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028362.html From david.holmes at oracle.com Thu Oct 5 22:12:30 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Oct 2017 08:12:30 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> Message-ID: <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> Hi Bob, On 6/10/2017 3:57 AM, Bob Vandette wrote: > >> On Oct 5, 2017, at 12:43 PM, Alex Bagehot > > wrote: >> >> Hi David, >> >> On Wed, Oct 4, 2017 at 10:51 PM, David Holmes > > wrote: >> >> Hi Alex, >> >> Can you tell me how shares/quotas are actually implemented in >> terms of allocating "cpus" to processes when shares/quotas are >> being applied? >> >> >> The allocation of cpus to processes/threads(tasks as the kernel sees >> them) or the other way round is called balancing, which is done by >> Scheduling domains[3]. >> >> cpu shares use CFS "group" scheduling[1] to apply the share to all the >> tasks(threads) in the container. The container cpu shares weight maps >> directly to a task's weight in CFS, which given it is part of a group >> is divided by the number of tasks in the group (ie. a default >> container share of 1024 with 2 threads in the container/group would >> result in each thread/task having a 512 weight[4]). The same values >> used by nice[2] also. >> >> You can observe the task weight and other scheduler numbers in >> /proc/sched_debug [4]. You can also kernel trace scheduler activity >> which typically tells you the tasks involved, the cpu, the event: >> switch or wakeup, etc. >> >> For example in a 12 cpu system if I have a 50% share do I get all >> 12 CPUs for 50% of a "quantum" each, or do I get 6 CPUs for a full >> quantum each? >> >> >> You get 12 cpus for 50% of the time on the average if there is another >> workload that has the same weight as you and is consuming as much as >> it can. >> If there's nothing else running on the machine you get 12 cpus for >> 100% of the time with a cpu shares only config (ie. the burst capacity). >> >> I validated that the share was balanced over all the cpus by running >> linux perf events and checking that there were cpu samples on all >> cpus. There's bound to be other ways of doing it also. >> >> >> When we try to use the "number of processors" to control the >> number of threads created, or the number of partitions in a task, >> then we really want to know how many CPUs we can actually be >> concurrently running on! > > I?m not sure that the primary question for serverless container > execution. Just because you might happen to burst and have available > to you more CPU time than you specified in your shares doesn?t mean > that a multi-threaded application running in one of these containers > should configure itself to use all available host processors. This > would result in over-burdoning the system at times of high load. 
And conversely if you restrict yourself to the "share" of processors you get over time (ie 6 instead of 12) then you can severely impact the performance (response time in particular) of the VM and the application running on the VM. But I don't see how this can overburden the system. If you app is running alone you get to use all 12 cpus for 100% of the time and life is good. If another app starts up then your 100% drops proportionately. If you schedule 12 apps all with a 1/12 share then everyone gets up to 12 cpus for 1/12 of the time. It's only if you try to schedule a set of apps with a utilization total greater than 1 does the system become overloaded. > The Java runtime, at startup, configures several subsystems to use a > number of threads for each system based on the number of available > processors. These subsystems include things like the number of GC > threads, JIT compiler and thread pools. > The problem I am trying to solve is to come up with a single number > of CPUs based on container knowledge that can be used for the Java > runtime subsystem to configure itself. I believe that we should > trust the implementor of the Mesos or Kubernetes setup and honor > their wishes when coming up with this number and not just use the > processor affinity or number of cpus in the cpuset. I don't agree, as has been discussed before. It's perfectly fine, even desirable, in my opinion to have 12 threads executing concurrently for 50% of the time, rather than only 6 threads for 100% (assuming the scheduling technology is even clever enough to realize it can grant your threads 100%). Over time the amount of work your app can execute is the same, but the time taken for an individual subtask can vary. If you are just doing one-shot batch processing then it makes no difference. If you're running an app that itself services incoming requests then the response time to individual requests can be impacted. To take the worst-case scenario, imagine you get 12 concurrent requests that would each take 1/12 of your cpu quota. With 12 threads on 12 cpus you can service all 12 requests with a response time of 1/12 time units. But with 6 threads on 6 cpus you can only service 6 requests with a 1/12 response time, and the other 6 will have a 1/6 response time. > The challenge is determining the right algorithm that doesn?t penalize > the VM. Agreed. But I think the current algorithm may penalize the VM, and more importantly the application it is running. > My current implementation does this: > > total available logical processors = min > (cpusets,sched_getaffinity,shares/1024, quota/period) > > All fractional units are rounded up to the next whole number. My point has always been that I just don't think producing a single number from all these factors is the right/best way to deal with this. I think we really want to be able to answer the question "how many processors can I concurrently execute on" distinct from the question of "how much of a time slice will I get on each of those processors". To me "how many" is the question that "availableProcessors" should be answering - and only that question. How much "share" do I get is a different question, and perhaps one that the VM and the application need to be able to ask. BTW sched_getaffinity should already account for cpusets ?? Cheers, David > Bob. > >> >> Makes sense to check. Hopefully there aren't any major errors or >> omissions in the above. 
>> Thanks, >> Alex >> >> [1] https://lwn.net/Articles/240474/ >> [2] >> https://github.com/torvalds/linux/blob/368f89984bb971b9f8b69eeb85ab19a89f985809/kernel/sched/core.c#L6735 >> >> [3] https://lwn.net/Articles/80911/ >> / http://www.i3s.unice.fr/~jplozi/wastedcores/files/extended_talk.pdf >> >> >> [4] >> >> cfs_rq[13]:/system.slice/docker-f5681788d6daab249c90810fe60da429a2565b901ff34245922a578635b5d607.scope >> >> .exec_clock: 0.000000 >> >> .MIN_vruntime: 0.000001 >> >> .min_vruntime: 8090.087297 >> >> .max_vruntime: 0.000001 >> >> .spread: 0.000000 >> >> .spread0 : -124692718.052832 >> >> .nr_spread_over: 0 >> >> .nr_running: 1 >> >> .load: 1024 >> >> .runnable_load_avg : 1023 >> >> .blocked_load_avg: 0 >> >> .tg_load_avg : 2046 >> >> .tg_load_contrib : 1023 >> >> .tg_runnable_contrib : 1023 >> >> .tg->runnable_avg: 2036 >> >> .tg->cfs_bandwidth.timer_active: 0 >> >> .throttled : 0 >> >> .throttle_count: 0 >> >> .se->exec_start: 236081964.515645 >> >> .se->vruntime: 24403993.326934 >> >> .se->sum_exec_runtime: 8091.135873 >> >> .se->load.weight : 512 >> >> .se->avg.runnable_avg_sum: 45979 >> >> .se->avg.runnable_avg_period : 45979 >> >> .se->avg.load_avg_contrib: 511 >> >> .se->avg.decay_count : 0 >> >> >> Thanks, >> David >> >> >> On 5/10/2017 6:01 AM, Alex Bagehot wrote: >> >> Hi, >> >> On Wed, Oct 4, 2017 at 7:51 PM, Bob Vandette >> > >> wrote: >> >> >> On Oct 4, 2017, at 2:30 PM, Robbin Ehn >> > >> wrote: >> >> Thanks Bob for looking into this. >> >> On 10/04/2017 08:14 PM, Bob Vandette wrote: >> >> Robbin, >> I?ve looked into this issue and you are correct. >> I do have to examine >> >> both the >> >> sched_getaffinity results as well as the cgroup >> cpu subsystem >> >> configuration >> >> files in order to provide a reasonable value for >> active_processors.? If >> >> I was only >> >> interested in cpusets, I could simply rely on the >> getaffinity call but >> >> I also want to >> >> factor in shares and quotas as well. >> >> >> We had a quick discussion at the office, we actually >> do think that you >> >> could skip reading the shares and quotas. >> >> It really depends on what the user expect, if he give >> us 4 cpu's with >> >> 50% or 2 full cpu what do he expect the differences would be? >> >> One could argue that he 'knows' that he will only use >> max 50% and thus >> >> we can act as if he is giving us 4 full cpu. >> >> But I'll leave that up to you, just a tough we had. >> >> >> It?s my opinion that we should do something if someone >> makes the effort to >> configure their >> containers to use quotas or shares.? There are many >> different opinions on >> what the right that >> right ?something? is. >> >> >> It might be interesting to look at some real instances of how >> java might[3] >> be deployed in containers. >> Marathon/Mesos[1] and Kubernetes[2] use shares and quotas so >> this is a vast >> chunk of deployments that need both of them today. >> >> >> >> Many developers that are trying to deploy apps that use >> containers say >> they don?t like >> cpusets.? This is too limiting for them especially when >> the server >> configurations vary >> within their organization. >> >> >> True, however Kubernetes has an alpha feature[5] where it >> allocates cpusets >> to containers that request a whole number of cpus. Previously >> without >> cpusets any container could run on any cpu which we know might >> not be good >> for some workloads that want isolation. A request for a >> fractional or >> burstable amount of cpu would be allocated from a shared cpu >> pool. 
So >> although manual allocation of cpusets will be flakey[3] , >> automation should >> be able to make it work. >> >> >> >> ?From everything I?ve read including source code, there >> seems to be a >> consensus that >> shares and quotas are being used as a way to specify a >> fraction of a >> system (number of cpus). >> >> >> A refinement[6] on this is: >> Shares can be used for guaranteed cpu - you will always get >> your share. >> Quota[4] is a limit/constraint - you can never get more than >> the quota. >> So given the below limit of how many shares will be allocated >> on a host you >> can have burstable(or overcommit) capacity if your shares are >> less than >> your quota. >> >> >> >> Docker added ?cpus which is implemented using quotas and >> periods.? They >> adjust these >> two parameters to provide a way of calculating the number >> of cpus that >> will be available >> to a process (quota/period).? Amazon also documents that >> cpu shares are >> defined to be a multiple of 1024. >> Where 1024 represents a single cpu and a share value of >> N*1024 represents >> N cpus. >> >> >> Kubernetes and Mesos/Marathon also use the N*1024 shares per >> host to >> allocate resources automatically. >> >> Hopefully this provides some background on what a couple of >> orchestration >> systems that will be running java are doing currently in this >> area. >> Thanks, >> Alex >> >> >> [1] https://github.com/apache/mesos/commit/346cc8dd528a28a6e >> >> 1f1cbdb4c95b8bdea2f6070 / (now out of date but appears to be a >> reasonable >> intro : >> https://zcox.wordpress.com/2014/09/17/cpu-resources-in-docke >> >> r-mesos-and-marathon/ ) >> [1a] https://youtu.be/hJyAfC-Z2xk?t=2439 >> >> >> [2] https://kubernetes.io/docs/concepts/configuration/manage >> >> -compute-resources-container/ >> >> [3] https://youtu.be/w1rZOY5gbvk?t=2479 >> >> >> [4] >> https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt >> >> https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf >> >> https://lwn.net/Articles/428175/ >> >> >> [5] >> https://github.com/kubernetes/community/blob/43ce57ac476b9f2ce3f0220354a075e095a0d469/contributors/design-proposals/node/cpu-manager.md >> >> / https://github.com/kubernetes/kubernetes/commit/ >> >> 00f0e0f6504ad8dd85fcbbd6294cd7cf2475fc72 / >> https://vimeo.com/226858314 >> >> >> [6] https://kubernetes.io/docs/concepts/configuration/manage- >> >> compute-resources-container/#how-pods-with-resource-limits-are-run >> >> >> Of course these are just conventions.? This is why I >> provided a way of >> specifying the >> number of CPUs so folks deploying Java services can be >> certain they get >> what they want. >> >> Bob. >> >> >> I had assumed that when sched_setaffinity was >> called (in your case by >> >> numactl) that the >> >> cgroup cpu config files would be updated to >> reflect the current >> >> processor affinity for the >> >> running process. This is not correct.? I have >> updated my changeset and >> >> have successfully >> >> run with your examples below.? I?ll post a new >> webrev soon. >> >> >> I see, thanks again! >> >> /Robbin >> >> Thanks, >> Bob. >> >> >> I still want to include the flag for at >> least one Java release in the >> >> event that the new behavior causes some regression >> >> in behavior.? I?m trying to make the >> detection robust so that it will >> >> fallback to the current behavior in the event >> >> that cgroups is not configured as expected >> but I?d like to have a way >> >> of forcing the issue.? 
JDK 10 is not >> >> supposed to be a long term support release >> which makes it a good >> >> target for this new behavior. >> >> I agree with David that once we commit to >> cgroups, we should extract >> >> all VM configuration data from that >> >> source.? There?s more information >> available for cpusets than just >> >> processor affinity that we might want to >> >> consider when calculating the number of >> processors to assume for the >> >> VM.? There?s exclusivity and >> >> effective cpu data available in addition >> to the cpuset string. >> >> >> cgroup only contains limits, not the real hard >> limits. >> You most consider the affinity mask. We that >> have numa nodes do: >> >> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 >> --membind=1 java >> >> -Xlog:os=debug -cp . ForEver | grep proc >> >> [0.001s][debug][os] Initial active processor >> count set to 16 >> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 >> --membind=1 java >> >> -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | >> grep proc >> >> [0.001s][debug][os] Initial active processor >> count set to 32 >> >> when benchmarking all the time and that must >> be set to 16 otherwise >> >> the flag is really bad for us. >> >> So the flag actually breaks the little numa >> support we have now. >> >> Thanks, Robbin >> >> >> >> > From david.holmes at oracle.com Fri Oct 6 06:01:46 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Oct 2017 16:01:46 +1000 Subject: RFR (M): 8188813: Generalize OrderAccess to use templates In-Reply-To: <59D639E1.7070104@oracle.com> References: <59D639E1.7070104@oracle.com> Message-ID: <378cd133-e7c8-4ebb-b20e-cfbb2aa30c0d@oracle.com> Hi Erik, On 5/10/2017 11:55 PM, Erik ?sterlund wrote: > Hi, > > Now that Atomic has been generalized with templates, the same should to > be done to OrderAccess. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8188813 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/ Well I didn't see anything too scary looking. :) I assume we'll drop the ptr variants at some stage. One query: src/hotspot/share/gc/shared/cardTableModRefBS.inline.hpp Can you declare "volatile jbyte* byte = ..." to avoid the volatile cast on the orderAccess call? > Testing: mach5 hs-tier3 > > Since Atomic already has a mechanism for type checking generic arguments > for Atomic::load/store, and OrderAccess also is a bunch of semantically > decorated loads and stores, I decided to reuse the template wheel that > was already invented (Atomic::LoadImpl and Atomic::StoreImpl). > Therefore, I made OrderAccess privately inherit Atomic so that this > infrastructure could be reused. A whole bunch of code has been nuked > with this generalization. Good! > It is worth noting that I have added PrimitiveConversion functionality > for doubles and floats which translates to using the union trick for > casting double to and from int64_t and float to and from int32_t when > passing down doubles and ints to the API. I need the former two, because > Java supports volatile double and volatile float, and therefore runtime > support for that needs to be able to use floats and doubles. I also I didn't quite follow that. What parts of the runtime need to operate on volatile float/double Java fields? > added PrimitiveConversion functionality for the subclasses of oop > (instanceOop and friends). The base class oop already supported this, so > it seemed natural that the subclasses should support it too. Ok. 
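For reference, the "union trick" being referred to amounts to the following bit-level cast. This is just a sketch with made-up helper names; the actual PrimitiveConversion code in the webrev is templated and more general.

  #include <stdint.h>

  // Sketch only: reinterpret the bits of a double as an int64_t and back,
  // so a volatile double store/load can be funneled through the integer
  // Atomic/OrderAccess paths without altering the bit pattern.
  inline int64_t double_as_bits(double v) {
    union { double d; int64_t i; } u;
    u.d = v;
    return u.i;
  }

  inline double bits_as_double(int64_t bits) {
    union { double d; int64_t i; } u;
    u.i = bits;
    return u.d;
  }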
Thanks, David ----- > Thanks, > /Erik From erik.osterlund at oracle.com Fri Oct 6 06:48:26 2017 From: erik.osterlund at oracle.com (Erik =?ISO-8859-1?Q?=D6sterlund?=) Date: Fri, 06 Oct 2017 08:48:26 +0200 Subject: RFR (M): 8188813: Generalize OrderAccess to use templates In-Reply-To: <378cd133-e7c8-4ebb-b20e-cfbb2aa30c0d@oracle.com> References: <59D639E1.7070104@oracle.com> <378cd133-e7c8-4ebb-b20e-cfbb2aa30c0d@oracle.com> Message-ID: <1507272506.23180.14.camel@oracle.com> Hi David, On fre, 2017-10-06 at 16:01 +1000, David Holmes wrote: > Hi Erik, > > On 5/10/2017 11:55 PM, Erik ?sterlund wrote: > > > > Hi, > > > > Now that Atomic has been generalized with templates, the same > > should to? > > be done to OrderAccess. > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8188813 > > > > Webrev: > > http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/ > Well I didn't see anything too scary looking. :) I assume we'll drop > the? > ptr variants at some stage. Yes, that is indeed the plan. > One query: > > src/hotspot/share/gc/shared/cardTableModRefBS.inline.hpp > > Can you declare "volatile jbyte* byte = ..." to avoid the volatile > cast? > on the orderAccess call? Sure. Fixed. > > > > > Testing: mach5 hs-tier3 > > > > Since Atomic already has a mechanism for type checking generic > > arguments? > > for Atomic::load/store, and OrderAccess also is a bunch of > > semantically? > > decorated loads and stores, I decided to reuse the template wheel > > that? > > was already invented (Atomic::LoadImpl and Atomic::StoreImpl). > > Therefore, I made OrderAccess privately inherit Atomic so that > > this? > > infrastructure could be reused. A whole bunch of code has been > > nuked? > > with this generalization. > Good! :) > > > > > It is worth noting that I have added PrimitiveConversion > > functionality? > > for doubles and floats which translates to using the union trick > > for? > > casting double to and from int64_t and float to and from int32_t > > when? > > passing down doubles and ints to the API. I need the former two, > > because? > > Java supports volatile double and volatile float, and therefore > > runtime? > > support for that needs to be able to use floats and doubles. I > > also? > I didn't quite follow that. What parts of the runtime need to operate > on? > volatile float/double Java fields? At the moment, there are multiple places that support the use of Java- volatile float/double. Some examples: * The static interpreter supports Java-volatile getfield/putfield (cf. cppInterpreter_zero.cpp:588, bytecodeInterpreter.cpp:2023) * unsafe supports getters and setters of Java-volatile doubles/floats (cf. unsafe.cpp:476). This support is not accidental. The Java language allows the use of volatile floats and doubles. Therefore we must support them in our runtime. Thanks for the review. /Erik > > > > > added PrimitiveConversion functionality for the subclasses of oop? > > (instanceOop and friends). The base class oop already supported > > this, so? > > it seemed natural that the subclasses should support it too. > Ok. 
> > Thanks, > David > ----- > > > > > Thanks, > > /Erik From ceeaspb at gmail.com Fri Oct 6 07:20:34 2017 From: ceeaspb at gmail.com (Alex Bagehot) Date: Fri, 6 Oct 2017 08:20:34 +0100 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <65492519-d2b2-82ae-37a0-4540d4c5b937@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <65492519-d2b2-82ae-37a0-4540d4c5b937@oracle.com> Message-ID: Hi Robbin, On Thursday, October 5, 2017, Robbin Ehn wrote: > Hi Alex, just a short question, > > You said something about "Marathon/Mesos[1] and Kubernetes[2] use shares > and quotas" > If you only use shares and quotas, do you not care about numa? (read trust > kernel) > On would think that you would setup a cgroup per numa node and split those > into cgroups with shares/quotas. It's a good point. I certainly care about numa; we test I think similar to you numactl 'ing driver/server processes to be in control of that variable. Kubernetes doesn't, yet [1]. Neither mesos [2]. Thanks Alex [1] https://github.com/kubernetes/kubernetes/issues/49964 [2] https://issues.apache.org/jira/plugins/servlet/mobile#issue/MESOS-6548 / https://issues.apache.org/jira/plugins/servlet/mobile#issue/MESOS-5342 > Thanks, Robbin > > On 10/05/2017 06:43 PM, Alex Bagehot wrote: > >> Hi David, >> >> On Wed, Oct 4, 2017 at 10:51 PM, David Holmes >> wrote: >> >> Hi Alex, >>> >>> Can you tell me how shares/quotas are actually implemented in terms of >>> allocating "cpus" to processes when shares/quotas are being applied? >>> >> >> >> The allocation of cpus to processes/threads(tasks as the kernel sees them) >> or the other way round is called balancing, which is done by Scheduling >> domains[3]. >> >> cpu shares use CFS "group" scheduling[1] to apply the share to all the >> tasks(threads) in the container. The container cpu shares weight maps >> directly to a task's weight in CFS, which given it is part of a group is >> divided by the number of tasks in the group (ie. a default container share >> of 1024 with 2 threads in the container/group would result in each >> thread/task having a 512 weight[4]). The same values used by nice[2] also. >> >> You can observe the task weight and other scheduler numbers in >> /proc/sched_debug [4]. You can also kernel trace scheduler activity which >> typically tells you the tasks involved, the cpu, the event: switch or >> wakeup, etc. >> >> >> For example in a 12 cpu system if I have a 50% share do I get all 12 CPUs >>> for 50% of a "quantum" each, or do I get 6 CPUs for a full quantum each? >>> >>> >> You get 12 cpus for 50% of the time on the average if there is another >> workload that has the same weight as you and is consuming as much as it >> can. >> If there's nothing else running on the machine you get 12 cpus for 100% of >> the time with a cpu shares only config (ie. the burst capacity). >> >> I validated that the share was balanced over all the cpus by running linux >> perf events and checking that there were cpu samples on all cpus. There's >> bound to be other ways of doing it also. 
>> >> >> >>> When we try to use the "number of processors" to control the number of >>> threads created, or the number of partitions in a task, then we really >>> want >>> to know how many CPUs we can actually be concurrently running on! >>> >>> >> Makes sense to check. Hopefully there aren't any major errors or omissions >> in the above. >> Thanks, >> Alex >> >> [1] https://lwn.net/Articles/240474/ >> [2] https://github.com/torvalds/linux/blob/368f89984bb971b9f8b69eeb85ab19 >> a89f985809/kernel/sched/core.c#L6735 >> [3] https://lwn.net/Articles/80911/ / http://www.i3s.unice.fr/~ >> jplozi/wastedcores/files/extended_talk.pdf >> >> [4] >> >> cfs_rq[13]:/system.slice/docker-f5681788d6daab249c90810fe60da4 >> 29a2565b901ff34245922a578635b5d607.scope >> >> .exec_clock : 0.000000 >> >> .MIN_vruntime : 0.000001 >> >> .min_vruntime : 8090.087297 >> >> .max_vruntime : 0.000001 >> >> .spread : 0.000000 >> >> .spread0 : -124692718.052832 >> >> .nr_spread_over : 0 >> >> .nr_running : 1 >> >> .load : 1024 >> >> .runnable_load_avg : 1023 >> >> .blocked_load_avg : 0 >> >> .tg_load_avg : 2046 >> >> .tg_load_contrib : 1023 >> >> .tg_runnable_contrib : 1023 >> >> .tg->runnable_avg : 2036 >> >> .tg->cfs_bandwidth.timer_active: 0 >> >> .throttled : 0 >> >> .throttle_count : 0 >> >> .se->exec_start : 236081964.515645 >> >> .se->vruntime : 24403993.326934 >> >> .se->sum_exec_runtime : 8091.135873 >> >> .se->load.weight : 512 >> >> .se->avg.runnable_avg_sum : 45979 >> >> .se->avg.runnable_avg_period : 45979 >> >> .se->avg.load_avg_contrib : 511 >> >> .se->avg.decay_count : 0 >> >> >> >>> Thanks, >>> David >>> >>> >>> On 5/10/2017 6:01 AM, Alex Bagehot wrote: >>> >>> Hi, >>>> >>>> On Wed, Oct 4, 2017 at 7:51 PM, Bob Vandette >>>> wrote: >>>> >>>> >>>> On Oct 4, 2017, at 2:30 PM, Robbin Ehn wrote: >>>>> >>>>>> >>>>>> Thanks Bob for looking into this. >>>>>> >>>>>> On 10/04/2017 08:14 PM, Bob Vandette wrote: >>>>>> >>>>>> Robbin, >>>>>>> I?ve looked into this issue and you are correct. I do have to >>>>>>> examine >>>>>>> >>>>>>> both the >>>>>> >>>>> >>>>> sched_getaffinity results as well as the cgroup cpu subsystem >>>>>> >>>>>>> >>>>>>> configuration >>>>>> >>>>> >>>>> files in order to provide a reasonable value for active_processors. If >>>>>> >>>>>>> >>>>>>> I was only >>>>>> >>>>> >>>>> interested in cpusets, I could simply rely on the getaffinity call but >>>>>> >>>>>>> >>>>>>> I also want to >>>>>> >>>>> >>>>> factor in shares and quotas as well. >>>>>> >>>>>>> >>>>>>> >>>>>> We had a quick discussion at the office, we actually do think that you >>>>>> >>>>>> could skip reading the shares and quotas. >>>>> >>>>> It really depends on what the user expect, if he give us 4 cpu's with >>>>>> >>>>>> 50% or 2 full cpu what do he expect the differences would be? >>>>> >>>>> One could argue that he 'knows' that he will only use max 50% and thus >>>>>> >>>>>> we can act as if he is giving us 4 full cpu. >>>>> >>>>> But I'll leave that up to you, just a tough we had. >>>>>> >>>>>> >>>>> It?s my opinion that we should do something if someone makes the effort >>>>> to >>>>> configure their >>>>> containers to use quotas or shares. There are many different opinions >>>>> on >>>>> what the right that >>>>> right ?something? is. >>>>> >>>>> >>>>> It might be interesting to look at some real instances of how java >>>> might[3] >>>> be deployed in containers. >>>> Marathon/Mesos[1] and Kubernetes[2] use shares and quotas so this is a >>>> vast >>>> chunk of deployments that need both of them today. 
>>>> >>>> >>>> >>>> Many developers that are trying to deploy apps that use containers say >>>>> they don?t like >>>>> cpusets. This is too limiting for them especially when the server >>>>> configurations vary >>>>> within their organization. >>>>> >>>>> >>>>> True, however Kubernetes has an alpha feature[5] where it allocates >>>> cpusets >>>> to containers that request a whole number of cpus. Previously without >>>> cpusets any container could run on any cpu which we know might not be >>>> good >>>> for some workloads that want isolation. A request for a fractional or >>>> burstable amount of cpu would be allocated from a shared cpu pool. So >>>> although manual allocation of cpusets will be flakey[3] , automation >>>> should >>>> be able to make it work. >>>> >>>> >>>> >>>> From everything I?ve read including source code, there seems to be a >>>>> consensus that >>>>> shares and quotas are being used as a way to specify a fraction of a >>>>> system (number of cpus). >>>>> >>>>> >>>>> A refinement[6] on this is: >>>> Shares can be used for guaranteed cpu - you will always get your share. >>>> Quota[4] is a limit/constraint - you can never get more than the quota. >>>> So given the below limit of how many shares will be allocated on a host >>>> you >>>> can have burstable(or overcommit) capacity if your shares are less than >>>> your quota. >>>> >>>> >>>> >>>> Docker added ?cpus which is implemented using quotas and periods. They >>>>> adjust these >>>>> two parameters to provide a way of calculating the number of cpus that >>>>> will be available >>>>> to a process (quota/period). Amazon also documents that cpu shares are >>>>> defined to be a multiple of 1024. >>>>> Where 1024 represents a single cpu and a share value of N*1024 >>>>> represents >>>>> N cpus. >>>>> >>>>> >>>>> Kubernetes and Mesos/Marathon also use the N*1024 shares per host to >>>> allocate resources automatically. >>>> >>>> Hopefully this provides some background on what a couple of >>>> orchestration >>>> systems that will be running java are doing currently in this area. >>>> Thanks, >>>> Alex >>>> >>>> >>>> [1] https://github.com/apache/mesos/commit/346cc8dd528a28a6e >>>> 1f1cbdb4c95b8bdea2f6070 / (now out of date but appears to be a >>>> reasonable >>>> intro : https://zcox.wordpress.com/2014/09/17/cpu-resources-in-docke >>>> r-mesos-and-marathon/ ) >>>> [1a] https://youtu.be/hJyAfC-Z2xk?t=2439 >>>> >>>> [2] https://kubernetes.io/docs/concepts/configuration/manage >>>> -compute-resources-container/ >>>> >>>> [3] https://youtu.be/w1rZOY5gbvk?t=2479 >>>> >>>> [4] https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt >>>> https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf >>>> https://lwn.net/Articles/428175/ >>>> >>>> [5] >>>> https://github.com/kubernetes/community/blob/43ce57ac476b9f2 >>>> ce3f0220354a075e095a0d469/contributors/design-proposals/node >>>> /cpu-manager.md >>>> / https://github.com/kubernetes/kubernetes/commit/ >>>> 00f0e0f6504ad8dd85fcbbd6294cd7cf2475fc72 / https://vimeo.com/226858314 >>>> >>>> >>>> [6] https://kubernetes.io/docs/concepts/configuration/manage- >>>> compute-resources-container/#how-pods-with-resource-limits-are-run >>>> >>>> >>>> Of course these are just conventions. This is why I provided a way of >>>> >>>>> specifying the >>>>> number of CPUs so folks deploying Java services can be certain they get >>>>> what they want. >>>>> >>>>> Bob. 
>>>>> >>>>> >>>>> I had assumed that when sched_setaffinity was called (in your case by >>>>>> >>>>>>> >>>>>>> numactl) that the >>>>>> >>>>> >>>>> cgroup cpu config files would be updated to reflect the current >>>>>> >>>>>>> >>>>>>> processor affinity for the >>>>>> >>>>> >>>>> running process. This is not correct. I have updated my changeset and >>>>>> >>>>>>> >>>>>>> have successfully >>>>>> >>>>> >>>>> run with your examples below. I?ll post a new webrev soon. >>>>>> >>>>>>> >>>>>>> >>>>>> I see, thanks again! >>>>>> >>>>>> /Robbin >>>>>> >>>>>> Thanks, >>>>>> >>>>>>> Bob. >>>>>>> >>>>>>> >>>>>>>> I still want to include the flag for at least one Java release in >>>>>>>> the >>>>>>>> >>>>>>>>> >>>>>>>>> event that the new behavior causes some regression >>>>>>>> >>>>>>> >>>>> in behavior. I?m trying to make the detection robust so that it will >>>>>> >>>>>>> >>>>>>>>> fallback to the current behavior in the event >>>>>>>> >>>>>>> >>>>> that cgroups is not configured as expected but I?d like to have a way >>>>>> >>>>>>> >>>>>>>>> of forcing the issue. JDK 10 is not >>>>>>>> >>>>>>> >>>>> supposed to be a long term support release which makes it a good >>>>>> >>>>>>> >>>>>>>>> target for this new behavior. >>>>>>>> >>>>>>> >>>>> I agree with David that once we commit to cgroups, we should extract >>>>>> >>>>>>> >>>>>>>>> all VM configuration data from that >>>>>>>> >>>>>>> >>>>> source. There?s more information available for cpusets than just >>>>>> >>>>>>> >>>>>>>>> processor affinity that we might want to >>>>>>>> >>>>>>> >>>>> consider when calculating the number of processors to assume for the >>>>>> >>>>>>> >>>>>>>>> VM. There?s exclusivity and >>>>>>>> >>>>>>> >>>>> effective cpu data available in addition to the cpuset string. >>>>>> >>>>>>> >>>>>>>>> >>>>>>>> cgroup only contains limits, not the real hard limits. >>>>>>>> You most consider the affinity mask. We that have numa nodes do: >>>>>>>> >>>>>>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java >>>>>>>> >>>>>>>> -Xlog:os=debug -cp . ForEver | grep proc >>>>>>> >>>>>> >>>>> [0.001s][debug][os] Initial active processor count set to 16 >>>>>> >>>>>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java >>>>>>>> >>>>>>>> -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc >>>>>>> >>>>>> >>>>> [0.001s][debug][os] Initial active processor count set to 32 >>>>>> >>>>>>> >>>>>>>> when benchmarking all the time and that must be set to 16 otherwise >>>>>>>> >>>>>>>> the flag is really bad for us. >>>>>>> >>>>>> >>>>> So the flag actually breaks the little numa support we have now. >>>>>> >>>>>>> >>>>>>>> Thanks, Robbin >>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>> From david.holmes at oracle.com Fri Oct 6 08:19:47 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Oct 2017 18:19:47 +1000 Subject: RFR (M): 8188813: Generalize OrderAccess to use templates In-Reply-To: <1507272506.23180.14.camel@oracle.com> References: <59D639E1.7070104@oracle.com> <378cd133-e7c8-4ebb-b20e-cfbb2aa30c0d@oracle.com> <1507272506.23180.14.camel@oracle.com> Message-ID: On 6/10/2017 4:48 PM, Erik ?sterlund wrote: > Hi David, > > On fre, 2017-10-06 at 16:01 +1000, David Holmes wrote: >> Hi Erik, >> >> On 5/10/2017 11:55 PM, Erik ?sterlund wrote: >>> >>> Hi, >>> >>> Now that Atomic has been generalized with templates, the same >>> should to >>> be done to OrderAccess. 
>>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8188813 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/ >> Well I didn't see anything too scary looking. :) I assume we'll drop >> the >> ptr variants at some stage. > > Yes, that is indeed the plan. > >> One query: >> >> src/hotspot/share/gc/shared/cardTableModRefBS.inline.hpp >> >> Can you declare "volatile jbyte* byte = ..." to avoid the volatile >> cast >> on the orderAccess call? > > Sure. Fixed. > >> >>> >>> Testing: mach5 hs-tier3 >>> >>> Since Atomic already has a mechanism for type checking generic >>> arguments >>> for Atomic::load/store, and OrderAccess also is a bunch of >>> semantically >>> decorated loads and stores, I decided to reuse the template wheel >>> that >>> was already invented (Atomic::LoadImpl and Atomic::StoreImpl). >>> Therefore, I made OrderAccess privately inherit Atomic so that >>> this >>> infrastructure could be reused. A whole bunch of code has been >>> nuked >>> with this generalization. >> Good! > > :) > >> >>> >>> It is worth noting that I have added PrimitiveConversion >>> functionality >>> for doubles and floats which translates to using the union trick >>> for >>> casting double to and from int64_t and float to and from int32_t >>> when >>> passing down doubles and ints to the API. I need the former two, >>> because >>> Java supports volatile double and volatile float, and therefore >>> runtime >>> support for that needs to be able to use floats and doubles. I >>> also >> I didn't quite follow that. What parts of the runtime need to operate >> on >> volatile float/double Java fields? > > At the moment, there are multiple places that support the use of Java- > volatile float/double. > > Some examples: > * The static interpreter supports Java-volatile getfield/putfield (cf. > cppInterpreter_zero.cpp:588, bytecodeInterpreter.cpp:2023) Yes this is the _implementation_ of volatile field access for floats/doubles. I don't count that as a "use". :) > * unsafe supports getters and setters of Java-volatile doubles/floats > (cf. unsafe.cpp:476). Yes this is more of a "use" but again very specific. > This support is not accidental. The Java language allows the use of > volatile floats and doubles. Therefore we must support them in our > runtime. Not quite what I meant. :) Other than the implementation of the Java volatile field accesses (direct of via Unsafe or intrinsics) I was wondering where we might need to do this. The general runtime tends not to do arbitary orderAccess or atomic operations on floats/doubles. Cheers, David > Thanks for the review. > > /Erik > >> >>> >>> added PrimitiveConversion functionality for the subclasses of oop >>> (instanceOop and friends). The base class oop already supported >>> this, so >>> it seemed natural that the subclasses should support it too. >> Ok. >> >> Thanks, >> David >> ----- >> >>> >>> Thanks, >>> /Erik From volker.simonis at gmail.com Fri Oct 6 08:28:03 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 6 Oct 2017 10:28:03 +0200 Subject: CFV: New hotspot Group Member: Ioi Lam In-Reply-To: References: Message-ID: Vote: yes On Mon, Oct 2, 2017 at 5:24 PM, wrote: > I hereby nominate Ioi Lam (OpenJDK user name: iklam) to Membership in the > hotspot Group. > > Ioi has been working on the hotspot project for over 5 years and is a > Reviewer in the JDK 9 Project with 79 changes. He is an expert in the area > of class data sharing. > > Votes are due by Monday, October 16, 2017. 
> > Only current Members of the hotspot Group [1] are eligible to vote on this > nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen > > [1]http://openjdk.java.net/census#hotspot > [2]http://openjdk.java.net/groups/#member-vote From goetz.lindenmaier at sap.com Fri Oct 6 08:29:53 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 6 Oct 2017 08:29:53 +0000 Subject: New hotspot Group Member: Ioi Lam In-Reply-To: References: Message-ID: <81e479634d5b43b9a7253c666241d7ba@sap.com> vote: yes > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf Of coleen.phillimore at oracle.com > Sent: Montag, 2. Oktober 2017 17:25 > To: hotspot-dev developers > Subject: CFV: New hotspot Group Member: Ioi Lam > > I hereby nominate Ioi Lam (OpenJDK user name: iklam) to Membership in > the hotspot Group. > > Ioi has been working on the hotspot project for over 5 years and is a > Reviewer in the JDK 9 Project with 79 changes.?? He is an expert in the > area of class data sharing. > > Votes are due by Monday, October 16, 2017. > > Only current Members of the hotspot Group [1] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen > > [1]http://openjdk.java.net/census#hotspot > [2]http://openjdk.java.net/groups/#member-vote From erik.osterlund at oracle.com Fri Oct 6 08:49:07 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Fri, 6 Oct 2017 10:49:07 +0200 Subject: RFR (M): 8188813: Generalize OrderAccess to use templates In-Reply-To: References: <59D639E1.7070104@oracle.com> <378cd133-e7c8-4ebb-b20e-cfbb2aa30c0d@oracle.com> <1507272506.23180.14.camel@oracle.com> Message-ID: <59D74383.5000204@oracle.com> Hi David, Thanks for looking into this. On 2017-10-06 10:19, David Holmes wrote: > On 6/10/2017 4:48 PM, Erik ?sterlund wrote: >> Hi David, >> >> On fre, 2017-10-06 at 16:01 +1000, David Holmes wrote: >>> Hi Erik, >>> >>> On 5/10/2017 11:55 PM, Erik ?sterlund wrote: >>>> >>>> Hi, >>>> >>>> Now that Atomic has been generalized with templates, the same >>>> should to >>>> be done to OrderAccess. >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8188813 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/ >>> Well I didn't see anything too scary looking. :) I assume we'll drop >>> the >>> ptr variants at some stage. >> >> Yes, that is indeed the plan. >> >>> One query: >>> >>> src/hotspot/share/gc/shared/cardTableModRefBS.inline.hpp >>> >>> Can you declare "volatile jbyte* byte = ..." to avoid the volatile >>> cast >>> on the orderAccess call? >> >> Sure. Fixed. >> >>> >>>> >>>> Testing: mach5 hs-tier3 >>>> >>>> Since Atomic already has a mechanism for type checking generic >>>> arguments >>>> for Atomic::load/store, and OrderAccess also is a bunch of >>>> semantically >>>> decorated loads and stores, I decided to reuse the template wheel >>>> that >>>> was already invented (Atomic::LoadImpl and Atomic::StoreImpl). >>>> Therefore, I made OrderAccess privately inherit Atomic so that >>>> this >>>> infrastructure could be reused. A whole bunch of code has been >>>> nuked >>>> with this generalization. >>> Good! 
>> >> :) >> >>> >>>> >>>> It is worth noting that I have added PrimitiveConversion >>>> functionality >>>> for doubles and floats which translates to using the union trick >>>> for >>>> casting double to and from int64_t and float to and from int32_t >>>> when >>>> passing down doubles and ints to the API. I need the former two, >>>> because >>>> Java supports volatile double and volatile float, and therefore >>>> runtime >>>> support for that needs to be able to use floats and doubles. I >>>> also >>> I didn't quite follow that. What parts of the runtime need to operate >>> on >>> volatile float/double Java fields? >> >> At the moment, there are multiple places that support the use of Java- >> volatile float/double. >> >> Some examples: >> * The static interpreter supports Java-volatile getfield/putfield (cf. >> cppInterpreter_zero.cpp:588, bytecodeInterpreter.cpp:2023) > > Yes this is the _implementation_ of volatile field access for > floats/doubles. I don't count that as a "use". :) > >> * unsafe supports getters and setters of Java-volatile doubles/floats >> (cf. unsafe.cpp:476). > > Yes this is more of a "use" but again very specific. Naturally, in order to support Java-volatile doubles and floats in the VM, we have the choice of 1) Flicking the PrimitiveConversion double/float switch allowing this to be automatically solved by the API and not rewriting uses of OrderAccess for supporting Java-volatile, or 2) Treating ordered accesses of double/float as special cases requiring manual (and very specific) casting to do the same thing. I thought alternative 1 was nicer, because I dislike unnecessary special cases. > >> This support is not accidental. The Java language allows the use of >> volatile floats and doubles. Therefore we must support them in our >> runtime. > > Not quite what I meant. :) Other than the implementation of the Java > volatile field accesses (direct of via Unsafe or intrinsics) I was > wondering where we might need to do this. The general runtime tends > not to do arbitary orderAccess or atomic operations on floats/doubles. We do not need it for anything else than supporting Java-volatile in the VM. Thanks, /Erik > Cheers, > David > >> Thanks for the review. >> >> /Erik >> >>> >>>> >>>> added PrimitiveConversion functionality for the subclasses of oop >>>> (instanceOop and friends). The base class oop already supported >>>> this, so >>>> it seemed natural that the subclasses should support it too. >>> Ok. >>> >>> Thanks, >>> David >>> ----- >>> >>>> >>>> Thanks, >>>> /Erik From goetz.lindenmaier at sap.com Fri Oct 6 09:13:18 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 6 Oct 2017 09:13:18 +0000 Subject: New hotspot Group Member: Markus Gronlund In-Reply-To: References: Message-ID: <542a2862ad4844908b31c415ff0e2447@sap.com> vote: yes Best regards, Goetz. > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf Of coleen.phillimore at oracle.com > Sent: Dienstag, 19. September 2017 19:55 > To: hotspot-dev developers > Subject: CFV: New hotspot Group Member: Markus Gronlund > > I hereby nominate Markus Gronlund (OpenJDK user name: mgronlun) to > Membership in the hotspot Group. > > Markus has been working on the hotspot project for over 5 years and is a > Reviewer in the JDK 9 Project with 51 changes.?? He is an expert in the > area of event based tracing of Java programs. > > Votes are due by Tuesday, October 3, 2017. 
> > Only current Members of the hotspot Group [1] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen > > [1]http://openjdk.java.net/census#hotspot > [2]http://openjdk.java.net/groups/#member-vote > > From coleen.phillimore at oracle.com Fri Oct 6 15:09:38 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 6 Oct 2017 11:09:38 -0400 Subject: RFR (M): 8188813: Generalize OrderAccess to use templates In-Reply-To: <59D639E1.7070104@oracle.com> References: <59D639E1.7070104@oracle.com> Message-ID: http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/src/hotspot/os_cpu/linux_aarch64/orderAccess_linux_aarch64.inline.hpp.udiff.html +template +struct OrderAccess::PlatformOrderedStore + VALUE_OBJ_CLASS_SPEC +{ + template + void operator()(T v, volatile T* p) const { release_store(p, v); fence(); } +}; Isn't release_store() removed by this patch?? Or does this call back to OrderAccess::release_store, which seems circular (?) Otherwise this looks really nice. I'll remove the *_ptr versions with https://bugs.openjdk.java.net/browse/JDK-8188220 . It's been fun. Thanks, Coleen On 10/5/17 9:55 AM, Erik ?sterlund wrote: > Hi, > > Now that Atomic has been generalized with templates, the same should > to be done to OrderAccess. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8188813 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/ > > Testing: mach5 hs-tier3 > > Since Atomic already has a mechanism for type checking generic > arguments for Atomic::load/store, and OrderAccess also is a bunch of > semantically decorated loads and stores, I decided to reuse the > template wheel that was already invented (Atomic::LoadImpl and > Atomic::StoreImpl). > Therefore, I made OrderAccess privately inherit Atomic so that this > infrastructure could be reused. A whole bunch of code has been nuked > with this generalization. > > It is worth noting that I have added PrimitiveConversion functionality > for doubles and floats which translates to using the union trick for > casting double to and from int64_t and float to and from int32_t when > passing down doubles and ints to the API. I need the former two, > because Java supports volatile double and volatile float, and > therefore runtime support for that needs to be able to use floats and > doubles. I also added PrimitiveConversion functionality for the > subclasses of oop (instanceOop and friends). The base class oop > already supported this, so it seemed natural that the subclasses > should support it too. 
> > Thanks, > /Erik From bob.vandette at oracle.com Fri Oct 6 15:34:37 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Fri, 6 Oct 2017 11:34:37 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> Message-ID: > On Oct 5, 2017, at 6:12 PM, David Holmes wrote: > > Hi Bob, > > On 6/10/2017 3:57 AM, Bob Vandette wrote: >>> On Oct 5, 2017, at 12:43 PM, Alex Bagehot > wrote: >>> >>> Hi David, >>> >>> On Wed, Oct 4, 2017 at 10:51 PM, David Holmes > wrote: >>> >>> Hi Alex, >>> >>> Can you tell me how shares/quotas are actually implemented in >>> terms of allocating "cpus" to processes when shares/quotas are >>> being applied? >>> >>> The allocation of cpus to processes/threads(tasks as the kernel sees them) or the other way round is called balancing, which is done by Scheduling domains[3]. >>> >>> cpu shares use CFS "group" scheduling[1] to apply the share to all the tasks(threads) in the container. The container cpu shares weight maps directly to a task's weight in CFS, which given it is part of a group is divided by the number of tasks in the group (ie. a default container share of 1024 with 2 threads in the container/group would result in each thread/task having a 512 weight[4]). The same values used by nice[2] also. >>> >>> You can observe the task weight and other scheduler numbers in /proc/sched_debug [4]. You can also kernel trace scheduler activity which typically tells you the tasks involved, the cpu, the event: switch or wakeup, etc. >>> >>> For example in a 12 cpu system if I have a 50% share do I get all >>> 12 CPUs for 50% of a "quantum" each, or do I get 6 CPUs for a full >>> quantum each? >>> >>> >>> You get 12 cpus for 50% of the time on the average if there is another workload that has the same weight as you and is consuming as much as it can. >>> If there's nothing else running on the machine you get 12 cpus for 100% of the time with a cpu shares only config (ie. the burst capacity). >>> >>> I validated that the share was balanced over all the cpus by running linux perf events and checking that there were cpu samples on all cpus. There's bound to be other ways of doing it also. >>> >>> >>> When we try to use the "number of processors" to control the >>> number of threads created, or the number of partitions in a task, >>> then we really want to know how many CPUs we can actually be >>> concurrently running on! >> I?m not sure that the primary question for serverless container execution. Just because you might happen to burst and have available >> to you more CPU time than you specified in your shares doesn?t mean >> that a multi-threaded application running in one of these containers should configure itself to use all available host processors. This would result in over-burdoning the system at times of high load. 
> > And conversely if you restrict yourself to the "share" of processors you get over time (ie 6 instead of 12) then you can severely impact the performance (response time in particular) of the VM and the application running on the VM. So if someone configures an 88 way system to use 1/88 share, you don?t think they expect a highly threaded application to run slower than if they didn?t restrict the shares?? The whole idea about shares is to SHARE the system. Yes, you?d have better performance when the system is idle and only running a single application but that?s not what these container frameworks are trying to accomplish. They want to get the best performance when running many many processes. That?s what I?m optimizing for. > > But I don't see how this can overburden the system. If you app is running alone you get to use all 12 cpus for 100% of the time and life is good. If another app starts up then your 100% drops proportionately. If you schedule 12 apps all with a 1/12 share then everyone gets up to 12 cpus for 1/12 of the time. It's only if you try to schedule a set of apps with a utilization total greater than 1 does the system become overloaded. In my above example, If we run the VM ergonomics based on 88 CPUs, then we are wasting a lot of memory on thread stacks and when many of these processes are running, the system will context switch a lot more than it would if we restricted the creation of threads to the share amount. Bob. > >> The Java runtime, at startup, configures several subsystems to use a number of threads for each system based on the number of available >> processors. These subsystems include things like the number of GC >> threads, JIT compiler and thread pools. > >> The problem I am trying to solve is to come up with a single number >> of CPUs based on container knowledge that can be used for the Java >> runtime subsystem to configure itself. I believe that we should >> trust the implementor of the Mesos or Kubernetes setup and honor their wishes when coming up with this number and not just use the >> processor affinity or number of cpus in the cpuset. > > I don't agree, as has been discussed before. It's perfectly fine, even desirable, in my opinion to have 12 threads executing concurrently for 50% of the time, rather than only 6 threads for 100% (assuming the scheduling technology is even clever enough to realize it can grant your threads 100%). > > Over time the amount of work your app can execute is the same, but the time taken for an individual subtask can vary. If you are just doing one-shot batch processing then it makes no difference. If you're running an app that itself services incoming requests then the response time to individual requests can be impacted. To take the worst-case scenario, imagine you get 12 concurrent requests that would each take 1/12 of your cpu quota. With 12 threads on 12 cpus you can service all 12 requests with a response time of 1/12 time units. But with 6 threads on 6 cpus you can only service 6 requests with a 1/12 response time, and the other 6 will have a 1/6 response time. > >> The challenge is determining the right algorithm that doesn?t penalize the VM. > > Agreed. But I think the current algorithm may penalize the VM, and more importantly the application it is running. > >> My current implementation does this: >> total available logical processors = min (cpusets,sched_getaffinity,shares/1024, quota/period) >> All fractional units are rounded up to the next whole number. 
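Spelled out, that calculation is roughly the following. This is an illustrative sketch only -- the function and parameter names are hypothetical and it is not the actual patch; the inputs are assumed to come from sched_getaffinity() and the cgroup v1 files cpu.shares, cpu.cfs_quota_us and cpu.cfs_period_us.

  // Illustrative sketch only; names are hypothetical, not from the patch.
  static int container_processor_count(int affinity_cpus,  // affinity mask, incl. cpusets
                                        long shares,       // cpu.shares, -1 if unset
                                        long quota_us,     // cpu.cfs_quota_us, -1 if unlimited
                                        long period_us) {  // cpu.cfs_period_us
    int limit = affinity_cpus;
    if (shares > 0) {
      int share_cpus = (int)((shares + 1023) / 1024);                  // shares/1024, rounded up
      if (share_cpus < limit) limit = share_cpus;
    }
    if (quota_us > 0 && period_us > 0) {
      int quota_cpus = (int)((quota_us + period_us - 1) / period_us);  // quota/period, rounded up
      if (quota_cpus < limit) limit = quota_cpus;
    }
    return limit > 0 ? limit : 1;
  }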
> > My point has always been that I just don't think producing a single number from all these factors is the right/best way to deal with this. I think we really want to be able to answer the question "how many processors can I concurrently execute on" distinct from the question of "how much of a time slice will I get on each of those processors". To me "how many" is the question that "availableProcessors" should be answering - and only that question. How much "share" do I get is a different question, and perhaps one that the VM and the application need to be able to ask. > > BTW sched_getaffinity should already account for cpusets ?? > > Cheers, > David > >> Bob. >>> >>> Makes sense to check. Hopefully there aren't any major errors or omissions in the above. >>> Thanks, >>> Alex >>> >>> [1] https://lwn.net/Articles/240474/ >>> [2] https://github.com/torvalds/linux/blob/368f89984bb971b9f8b69eeb85ab19a89f985809/kernel/sched/core.c#L6735 >>> [3] https://lwn.net/Articles/80911/ / http://www.i3s.unice.fr/~jplozi/wastedcores/files/extended_talk.pdf >>> >>> [4] >>> >>> cfs_rq[13]:/system.slice/docker-f5681788d6daab249c90810fe60da429a2565b901ff34245922a578635b5d607.scope >>> >>> .exec_clock: 0.000000 >>> >>> .MIN_vruntime: 0.000001 >>> >>> .min_vruntime: 8090.087297 >>> >>> .max_vruntime: 0.000001 >>> >>> .spread: 0.000000 >>> >>> .spread0 : -124692718.052832 >>> >>> .nr_spread_over: 0 >>> >>> .nr_running: 1 >>> >>> .load: 1024 >>> >>> .runnable_load_avg : 1023 >>> >>> .blocked_load_avg: 0 >>> >>> .tg_load_avg : 2046 >>> >>> .tg_load_contrib : 1023 >>> >>> .tg_runnable_contrib : 1023 >>> >>> .tg->runnable_avg: 2036 >>> >>> .tg->cfs_bandwidth.timer_active: 0 >>> >>> .throttled : 0 >>> >>> .throttle_count: 0 >>> >>> .se->exec_start: 236081964.515645 >>> >>> .se->vruntime: 24403993.326934 >>> >>> .se->sum_exec_runtime: 8091.135873 >>> >>> .se->load.weight : 512 >>> >>> .se->avg.runnable_avg_sum: 45979 >>> >>> .se->avg.runnable_avg_period : 45979 >>> >>> .se->avg.load_avg_contrib: 511 >>> >>> .se->avg.decay_count : 0 >>> >>> >>> Thanks, >>> David >>> >>> >>> On 5/10/2017 6:01 AM, Alex Bagehot wrote: >>> >>> Hi, >>> >>> On Wed, Oct 4, 2017 at 7:51 PM, Bob Vandette >>> > >>> wrote: >>> >>> >>> On Oct 4, 2017, at 2:30 PM, Robbin Ehn >>> > >>> wrote: >>> >>> Thanks Bob for looking into this. >>> >>> On 10/04/2017 08:14 PM, Bob Vandette wrote: >>> >>> Robbin, >>> I?ve looked into this issue and you are correct. I do have to examine >>> >>> both the >>> >>> sched_getaffinity results as well as the cgroup >>> cpu subsystem >>> >>> configuration >>> >>> files in order to provide a reasonable value for >>> active_processors. If >>> >>> I was only >>> >>> interested in cpusets, I could simply rely on the >>> getaffinity call but >>> >>> I also want to >>> >>> factor in shares and quotas as well. >>> >>> >>> We had a quick discussion at the office, we actually >>> do think that you >>> >>> could skip reading the shares and quotas. >>> >>> It really depends on what the user expect, if he give >>> us 4 cpu's with >>> >>> 50% or 2 full cpu what do he expect the differences would be? >>> >>> One could argue that he 'knows' that he will only use >>> max 50% and thus >>> >>> we can act as if he is giving us 4 full cpu. >>> >>> But I'll leave that up to you, just a tough we had. >>> >>> >>> It?s my opinion that we should do something if someone >>> makes the effort to >>> configure their >>> containers to use quotas or shares. There are many >>> different opinions on >>> what the right that >>> right ?something? is. 
>>> >>> >>> It might be interesting to look at some real instances of how >>> java might[3] >>> be deployed in containers. >>> Marathon/Mesos[1] and Kubernetes[2] use shares and quotas so >>> this is a vast >>> chunk of deployments that need both of them today. >>> >>> >>> >>> Many developers that are trying to deploy apps that use >>> containers say >>> they don?t like >>> cpusets. This is too limiting for them especially when >>> the server >>> configurations vary >>> within their organization. >>> >>> >>> True, however Kubernetes has an alpha feature[5] where it >>> allocates cpusets >>> to containers that request a whole number of cpus. Previously >>> without >>> cpusets any container could run on any cpu which we know might >>> not be good >>> for some workloads that want isolation. A request for a >>> fractional or >>> burstable amount of cpu would be allocated from a shared cpu >>> pool. So >>> although manual allocation of cpusets will be flakey[3] , >>> automation should >>> be able to make it work. >>> >>> >>> >>> From everything I?ve read including source code, there >>> seems to be a >>> consensus that >>> shares and quotas are being used as a way to specify a >>> fraction of a >>> system (number of cpus). >>> >>> >>> A refinement[6] on this is: >>> Shares can be used for guaranteed cpu - you will always get >>> your share. >>> Quota[4] is a limit/constraint - you can never get more than >>> the quota. >>> So given the below limit of how many shares will be allocated >>> on a host you >>> can have burstable(or overcommit) capacity if your shares are >>> less than >>> your quota. >>> >>> >>> >>> Docker added ?cpus which is implemented using quotas and >>> periods. They >>> adjust these >>> two parameters to provide a way of calculating the number >>> of cpus that >>> will be available >>> to a process (quota/period). Amazon also documents that >>> cpu shares are >>> defined to be a multiple of 1024. >>> Where 1024 represents a single cpu and a share value of >>> N*1024 represents >>> N cpus. >>> >>> >>> Kubernetes and Mesos/Marathon also use the N*1024 shares per >>> host to >>> allocate resources automatically. >>> >>> Hopefully this provides some background on what a couple of >>> orchestration >>> systems that will be running java are doing currently in this >>> area. >>> Thanks, >>> Alex >>> >>> >>> [1] https://github.com/apache/mesos/commit/346cc8dd528a28a6e >>> >>> 1f1cbdb4c95b8bdea2f6070 / (now out of date but appears to be a >>> reasonable >>> intro : >>> https://zcox.wordpress.com/2014/09/17/cpu-resources-in-docke >>> >>> r-mesos-and-marathon/ ) >>> [1a] https://youtu.be/hJyAfC-Z2xk?t=2439 >>> >>> >>> [2] https://kubernetes.io/docs/concepts/configuration/manage >>> >>> -compute-resources-container/ >>> >>> [3] https://youtu.be/w1rZOY5gbvk?t=2479 >>> >>> >>> [4] >>> https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt >>> >>> https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf >>> >>> https://lwn.net/Articles/428175/ >>> >>> >>> [5] >>> https://github.com/kubernetes/community/blob/43ce57ac476b9f2ce3f0220354a075e095a0d469/contributors/design-proposals/node/cpu-manager.md >>> >>> / https://github.com/kubernetes/kubernetes/commit/ >>> >>> 00f0e0f6504ad8dd85fcbbd6294cd7cf2475fc72 / >>> https://vimeo.com/226858314 >>> >>> >>> [6] https://kubernetes.io/docs/concepts/configuration/manage- >>> >>> compute-resources-container/#how-pods-with-resource-limits-are-run >>> >>> >>> Of course these are just conventions. 
This is why I >>> provided a way of >>> specifying the >>> number of CPUs so folks deploying Java services can be >>> certain they get >>> what they want. >>> >>> Bob. >>> >>> >>> I had assumed that when sched_setaffinity was >>> called (in your case by >>> >>> numactl) that the >>> >>> cgroup cpu config files would be updated to >>> reflect the current >>> >>> processor affinity for the >>> >>> running process. This is not correct. I have >>> updated my changeset and >>> >>> have successfully >>> >>> run with your examples below. I?ll post a new >>> webrev soon. >>> >>> >>> I see, thanks again! >>> >>> /Robbin >>> >>> Thanks, >>> Bob. >>> >>> >>> I still want to include the flag for at >>> least one Java release in the >>> >>> event that the new behavior causes some regression >>> >>> in behavior. I?m trying to make the >>> detection robust so that it will >>> >>> fallback to the current behavior in the event >>> >>> that cgroups is not configured as expected >>> but I?d like to have a way >>> >>> of forcing the issue. JDK 10 is not >>> >>> supposed to be a long term support release >>> which makes it a good >>> >>> target for this new behavior. >>> >>> I agree with David that once we commit to >>> cgroups, we should extract >>> >>> all VM configuration data from that >>> >>> source. There?s more information >>> available for cpusets than just >>> >>> processor affinity that we might want to >>> >>> consider when calculating the number of >>> processors to assume for the >>> >>> VM. There?s exclusivity and >>> >>> effective cpu data available in addition >>> to the cpuset string. >>> >>> >>> cgroup only contains limits, not the real hard >>> limits. >>> You most consider the affinity mask. We that >>> have numa nodes do: >>> >>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 >>> --membind=1 java >>> >>> -Xlog:os=debug -cp . ForEver | grep proc >>> >>> [0.001s][debug][os] Initial active processor >>> count set to 16 >>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 >>> --membind=1 java >>> >>> -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | >>> grep proc >>> >>> [0.001s][debug][os] Initial active processor >>> count set to 32 >>> >>> when benchmarking all the time and that must >>> be set to 16 otherwise >>> >>> the flag is really bad for us. >>> >>> So the flag actually breaks the little numa >>> support we have now. >>> >>> Thanks, Robbin >>> >>> >>> >>> From vladimir.kozlov at oracle.com Fri Oct 6 17:22:24 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 6 Oct 2017 10:22:24 -0700 Subject: RFR(XXS) 8187685: NMT: Tracking compiler memory usage of thread's resource area In-Reply-To: <69808d92-6ac8-9d83-61dc-6bb45936b4dc@redhat.com> References: <69808d92-6ac8-9d83-61dc-6bb45936b4dc@redhat.com> Message-ID: Good. Thank you, Zhengyu. Vladimir On 10/5/17 12:47 PM, Zhengyu Gu wrote: > Compiler uses resource area for compilation, let's bias it to mtCompiler > for more accurate memory counting. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8187685 > Webrev: http://cr.openjdk.java.net/~zgu/8187685/webrev.00/index.html > > > Discussion thread: > http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028360.html > > > Test: > > ? hotspot_tier1? fastdebug and release on Linux x64. 
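As a toy illustration of what "bias it to mtCompiler" means here -- this is not HotSpot code and every name below is made up -- the thread's resource arena simply gets re-tagged so that NMT attributes its allocations to the compiler category rather than the default one.

  // Toy model only -- not HotSpot code; all names below are illustrative.
  enum ToyMemFlags { toyThread, toyCompiler };   // stand-ins for mtThread / mtCompiler

  struct ToyResourceArea {
    ToyMemFlags _flags;                          // category charged for this arena's allocations
    explicit ToyResourceArea(ToyMemFlags f) : _flags(f) {}
    void bias_to(ToyMemFlags f) { _flags = f; }  // re-tag future accounting
  };

  // A compiler thread would re-bias its resource area once during setup,
  // so memory tracking charges its resource allocations to the compiler:
  //   ToyResourceArea ra(toyThread);
  //   ra.bias_to(toyCompiler);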
> > Thanks, > > -Zhengyu From coleen.phillimore at oracle.com Fri Oct 6 17:53:53 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 6 Oct 2017 13:53:53 -0400 Subject: RFR(XXS) 8187685: NMT: Tracking compiler memory usage of thread's resource area In-Reply-To: References: <69808d92-6ac8-9d83-61dc-6bb45936b4dc@redhat.com> Message-ID: This seems fine.? I'll sponsor it for you. Coleen On 10/6/17 1:22 PM, Vladimir Kozlov wrote: > Good. Thank you, Zhengyu. > > Vladimir > > On 10/5/17 12:47 PM, Zhengyu Gu wrote: >> Compiler uses resource area for compilation, let's bias it to >> mtCompiler for more accurate memory counting. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8187685 >> Webrev: http://cr.openjdk.java.net/~zgu/8187685/webrev.00/index.html >> >> >> Discussion thread: >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028360.html >> >> >> Test: >> >> ?? hotspot_tier1? fastdebug and release on Linux x64. >> >> Thanks, >> >> -Zhengyu From ioi.lam at oracle.com Fri Oct 6 20:19:20 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Fri, 6 Oct 2017 13:19:20 -0700 Subject: RFR (XS) 8188828 Intermittent ClassNotFoundException: jdk.test.lib.Platform for compiler tests Message-ID: <1be927fa-fa4b-1964-93f3-1c72386acf7b@oracle.com> Please review this very simple change: https://bugs.openjdk.java.net/browse/JDK-8188828 http://ioilinux.us.oracle.com/webrev/jdk10/8188828_compiler_test_class_not_found.v01/ The dependency of ??? FileInstaller -> Utils -> JDKToolLauncher -> Platform has caused many intermittent ClassNotFoundException in the hotspot nightly runs. While this fix does not address the root cause (proper dependencies are not specified in the test cases -- which we are planning to fix), we will hopefully see much fewer occurrences of this annoying failure scenario. Thanks a lot to Igor for suggesting the simple fix! - Ioi From igor.ignatyev at oracle.com Fri Oct 6 20:28:58 2017 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 6 Oct 2017 13:28:58 -0700 Subject: RFR (XS) 8188828 Intermittent ClassNotFoundException: jdk.test.lib.Platform for compiler tests In-Reply-To: <1be927fa-fa4b-1964-93f3-1c72386acf7b@oracle.com> References: <1be927fa-fa4b-1964-93f3-1c72386acf7b@oracle.com> Message-ID: <56725166-18B0-47B4-A8FB-DED8B149604D@oracle.com> Hi Ioi, I'm really happy we found such a simple workaround for this annoying problem and hope it'll greatly reduce CNFE in our test runs. the fix looks good to me. Thanks, -- Igor > On Oct 6, 2017, at 1:19 PM, Ioi Lam wrote: > > Please review this very simple change: > > https://bugs.openjdk.java.net/browse/JDK-8188828 > http://ioilinux.us.oracle.com/webrev/jdk10/8188828_compiler_test_class_not_found.v01/ > > The dependency of > > FileInstaller -> Utils -> JDKToolLauncher -> Platform > > has caused many intermittent ClassNotFoundException in the hotspot nightly runs. > While this fix does not address the root cause (proper dependencies are not > specified in the test cases -- which we are planning to fix), we will hopefully > see much fewer occurrences of this annoying failure scenario. > > Thanks a lot to Igor for suggesting the simple fix! 
> > - Ioi > From george.triantafillou at oracle.com Fri Oct 6 20:39:16 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Fri, 6 Oct 2017 16:39:16 -0400 Subject: RFR (XS) 8188828 Intermittent ClassNotFoundException: jdk.test.lib.Platform for compiler tests In-Reply-To: <56725166-18B0-47B4-A8FB-DED8B149604D@oracle.com> References: <1be927fa-fa4b-1964-93f3-1c72386acf7b@oracle.com> <56725166-18B0-47B4-A8FB-DED8B149604D@oracle.com> Message-ID: Hi Ioi, Looks good! -George On 10/6/2017 4:28 PM, Igor Ignatyev wrote: > Hi Ioi, > > I'm really happy we found such a simple workaround for this annoying problem and hope it'll greatly reduce CNFE in our test runs. > > the fix looks good to me. > > Thanks, > -- Igor > >> On Oct 6, 2017, at 1:19 PM, Ioi Lam wrote: >> >> Please review this very simple change: >> >> https://bugs.openjdk.java.net/browse/JDK-8188828 >> http://ioilinux.us.oracle.com/webrev/jdk10/8188828_compiler_test_class_not_found.v01/ >> >> The dependency of >> >> FileInstaller -> Utils -> JDKToolLauncher -> Platform >> >> has caused many intermittent ClassNotFoundException in the hotspot nightly runs. >> While this fix does not address the root cause (proper dependencies are not >> specified in the test cases -- which we are planning to fix), we will hopefully >> see much fewer occurrences of this annoying failure scenario. >> >> Thanks a lot to Igor for suggesting the simple fix! >> >> - Ioi >> From zgu at redhat.com Fri Oct 6 21:44:05 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 6 Oct 2017 17:44:05 -0400 Subject: RFR(XXS) 8187685: NMT: Tracking compiler memory usage of thread's resource area In-Reply-To: References: <69808d92-6ac8-9d83-61dc-6bb45936b4dc@redhat.com> Message-ID: <16b25caf-8dc2-c899-3840-553908c5ebf5@redhat.com> Thanks for the review, Vladimir. -Zhengyu On 10/06/2017 01:22 PM, Vladimir Kozlov wrote: > Good. Thank you, Zhengyu. > > Vladimir > > On 10/5/17 12:47 PM, Zhengyu Gu wrote: >> Compiler uses resource area for compilation, let's bias it to >> mtCompiler for more accurate memory counting. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8187685 >> Webrev: http://cr.openjdk.java.net/~zgu/8187685/webrev.00/index.html >> >> >> Discussion thread: >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028360.html >> >> >> Test: >> >> hotspot_tier1 fastdebug and release on Linux x64. >> >> Thanks, >> >> -Zhengyu From zgu at redhat.com Fri Oct 6 21:45:42 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 6 Oct 2017 17:45:42 -0400 Subject: RFR(XXS) 8187685: NMT: Tracking compiler memory usage of thread's resource area In-Reply-To: References: <69808d92-6ac8-9d83-61dc-6bb45936b4dc@redhat.com> Message-ID: Hi Coleen, Thanks for the review and sponsor! -Zhengyu On 10/06/2017 01:53 PM, coleen.phillimore at oracle.com wrote: > This seems fine. I'll sponsor it for you. > Coleen > > > On 10/6/17 1:22 PM, Vladimir Kozlov wrote: >> Good. Thank you, Zhengyu. >> >> Vladimir >> >> On 10/5/17 12:47 PM, Zhengyu Gu wrote: >>> Compiler uses resource area for compilation, let's bias it to >>> mtCompiler for more accurate memory counting. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8187685 >>> Webrev: http://cr.openjdk.java.net/~zgu/8187685/webrev.00/index.html >>> >>> >>> Discussion thread: >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028360.html >>> >>> >>> Test: >>> >>> hotspot_tier1 fastdebug and release on Linux x64. 
>>> >>> Thanks, >>> >>> -Zhengyu > -------------- next part -------------- A non-text attachment was scrubbed... Name: 8187685.patch Type: text/x-patch Size: 2473 bytes Desc: not available URL: From david.holmes at oracle.com Fri Oct 6 23:10:34 2017 From: david.holmes at oracle.com (David Holmes) Date: Sat, 7 Oct 2017 09:10:34 +1000 Subject: RFR (M): 8188813: Generalize OrderAccess to use templates In-Reply-To: References: <59D639E1.7070104@oracle.com> Message-ID: On 7/10/2017 1:09 AM, coleen.phillimore at oracle.com wrote: > http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/src/hotspot/os_cpu/linux_aarch64/orderAccess_linux_aarch64.inline.hpp.udiff.html > > > +template > +struct OrderAccess::PlatformOrderedStore > + VALUE_OBJ_CLASS_SPEC > +{ > + template > + void operator()(T v, volatile T* p) const { release_store(p, v); > fence(); } > +}; > > Isn't release_store() removed by this patch?? Or does this call back to > OrderAccess::release_store, which seems circular (?) It's the same as the existing implementation. Without a specialization for a specific CPU the release_store_fence is just a release_store then a fence. David > Otherwise this looks really nice. > > I'll remove the *_ptr versions with > https://bugs.openjdk.java.net/browse/JDK-8188220 . It's been fun. > > Thanks, > Coleen > > > On 10/5/17 9:55 AM, Erik ?sterlund wrote: >> Hi, >> >> Now that Atomic has been generalized with templates, the same should >> to be done to OrderAccess. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8188813 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/ >> >> Testing: mach5 hs-tier3 >> >> Since Atomic already has a mechanism for type checking generic >> arguments for Atomic::load/store, and OrderAccess also is a bunch of >> semantically decorated loads and stores, I decided to reuse the >> template wheel that was already invented (Atomic::LoadImpl and >> Atomic::StoreImpl). >> Therefore, I made OrderAccess privately inherit Atomic so that this >> infrastructure could be reused. A whole bunch of code has been nuked >> with this generalization. >> >> It is worth noting that I have added PrimitiveConversion functionality >> for doubles and floats which translates to using the union trick for >> casting double to and from int64_t and float to and from int32_t when >> passing down doubles and ints to the API. I need the former two, >> because Java supports volatile double and volatile float, and >> therefore runtime support for that needs to be able to use floats and >> doubles. I also added PrimitiveConversion functionality for the >> subclasses of oop (instanceOop and friends). The base class oop >> already supported this, so it seemed natural that the subclasses >> should support it too. 
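For the volatile double/float support mentioned above, the Java-level counterpart of that union-based conversion is the raw bits round trip below; this is only an analogy for what PrimitiveConversion does on the C++ side, not HotSpot code:

    public class FloatBitsRoundTrip {
        public static void main(String[] args) {
            double d = 3.141592653589793;
            long bits = Double.doubleToRawLongBits(d);    // reinterpret double as int64
            double back = Double.longBitsToDouble(bits);  // and back, bit pattern preserved
            System.out.println(d == back);                // true: no value conversion happened

            float f = 2.5f;
            int fbits = Float.floatToRawIntBits(f);       // reinterpret float as int32
            System.out.println(Float.intBitsToFloat(fbits) == f);
        }
    }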
>> >> Thanks, >> /Erik > From david.holmes at oracle.com Fri Oct 6 23:28:14 2017 From: david.holmes at oracle.com (David Holmes) Date: Sat, 7 Oct 2017 09:28:14 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> Message-ID: On 7/10/2017 1:34 AM, Bob Vandette wrote: > >> On Oct 5, 2017, at 6:12 PM, David Holmes wrote: >> >> Hi Bob, >> >> On 6/10/2017 3:57 AM, Bob Vandette wrote: >>>> On Oct 5, 2017, at 12:43 PM, Alex Bagehot > wrote: >>>> >>>> Hi David, >>>> >>>> On Wed, Oct 4, 2017 at 10:51 PM, David Holmes > wrote: >>>> >>>> Hi Alex, >>>> >>>> Can you tell me how shares/quotas are actually implemented in >>>> terms of allocating "cpus" to processes when shares/quotas are >>>> being applied? >>>> >>>> The allocation of cpus to processes/threads(tasks as the kernel sees them) or the other way round is called balancing, which is done by Scheduling domains[3]. >>>> >>>> cpu shares use CFS "group" scheduling[1] to apply the share to all the tasks(threads) in the container. The container cpu shares weight maps directly to a task's weight in CFS, which given it is part of a group is divided by the number of tasks in the group (ie. a default container share of 1024 with 2 threads in the container/group would result in each thread/task having a 512 weight[4]). The same values used by nice[2] also. >>>> >>>> You can observe the task weight and other scheduler numbers in /proc/sched_debug [4]. You can also kernel trace scheduler activity which typically tells you the tasks involved, the cpu, the event: switch or wakeup, etc. >>>> >>>> For example in a 12 cpu system if I have a 50% share do I get all >>>> 12 CPUs for 50% of a "quantum" each, or do I get 6 CPUs for a full >>>> quantum each? >>>> >>>> >>>> You get 12 cpus for 50% of the time on the average if there is another workload that has the same weight as you and is consuming as much as it can. >>>> If there's nothing else running on the machine you get 12 cpus for 100% of the time with a cpu shares only config (ie. the burst capacity). >>>> >>>> I validated that the share was balanced over all the cpus by running linux perf events and checking that there were cpu samples on all cpus. There's bound to be other ways of doing it also. >>>> >>>> >>>> When we try to use the "number of processors" to control the >>>> number of threads created, or the number of partitions in a task, >>>> then we really want to know how many CPUs we can actually be >>>> concurrently running on! >>> I?m not sure that the primary question for serverless container execution. Just because you might happen to burst and have available >>> to you more CPU time than you specified in your shares doesn?t mean >>> that a multi-threaded application running in one of these containers should configure itself to use all available host processors. This would result in over-burdoning the system at times of high load. 
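One concrete reason the reported count matters either way: libraries and applications routinely size their worker pools straight off Runtime.availableProcessors(), so whatever number the VM settles on is multiplied through the whole process. A trivial illustration (the fixed-size pool is just an example consumer of that number):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PoolSizedByCpus {
        public static void main(String[] args) {
            int cpus = Runtime.getRuntime().availableProcessors();
            System.out.println("availableProcessors() = " + cpus);
            // One worker per reported CPU is a common pattern: report 6 instead
            // of 12 and the application starts half as many workers.
            ExecutorService pool = Executors.newFixedThreadPool(cpus);
            pool.shutdown();
        }
    }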
>> >> And conversely if you restrict yourself to the "share" of processors you get over time (ie 6 instead of 12) then you can severely impact the performance (response time in particular) of the VM and the application running on the VM. > > So if someone configures an 88 way system to use 1/88 share, you don?t think they expect a highly threaded > application to run slower than if they didn?t restrict the shares?? The whole idea about shares is to SHARE the > system. Yes, you?d have better performance when the system is idle and only running a single application but that?s > not what these container frameworks are trying to accomplish. They want to get the best performance when running many > many processes. That?s what I?m optimizing for. In what I described you are SHARING the system. You're also getting the most benefit from a lightly loaded system. To me the conceptual model for a 1/88 share of an 88-way system is that you get 88 processors that appear to run at 1/88 the speed of the physical ones. Not that you get 1 real full speed processor. >> >> But I don't see how this can overburden the system. If you app is running alone you get to use all 12 cpus for 100% of the time and life is good. If another app starts up then your 100% drops proportionately. If you schedule 12 apps all with a 1/12 share then everyone gets up to 12 cpus for 1/12 of the time. It's only if you try to schedule a set of apps with a utilization total greater than 1 does the system become overloaded. > > In my above example, If we run the VM ergonomics based on 88 CPUs, then we are wasting a lot of memory on thread stacks and when > many of these processes are running, the system will context switch a lot more than it would if we restricted the creation of threads to > the share amount. Context switching is a function of threads and time. My way uses more threads and less time (per unit of work); yours uses less threads and more time. Seems like zero sum to me. Memory use is a different matter, but only because you can restrict memory independently of cpus. So you will need to ensure your memory quotas can accommodate the number of threads you expect to run - regardless. David ----- > Bob. > > >> >>> The Java runtime, at startup, configures several subsystems to use a number of threads for each system based on the number of available >>> processors. These subsystems include things like the number of GC >>> threads, JIT compiler and thread pools. >> >>> The problem I am trying to solve is to come up with a single number >>> of CPUs based on container knowledge that can be used for the Java >>> runtime subsystem to configure itself. I believe that we should >>> trust the implementor of the Mesos or Kubernetes setup and honor their wishes when coming up with this number and not just use the >>> processor affinity or number of cpus in the cpuset. >> >> I don't agree, as has been discussed before. It's perfectly fine, even desirable, in my opinion to have 12 threads executing concurrently for 50% of the time, rather than only 6 threads for 100% (assuming the scheduling technology is even clever enough to realize it can grant your threads 100%). >> >> Over time the amount of work your app can execute is the same, but the time taken for an individual subtask can vary. If you are just doing one-shot batch processing then it makes no difference. If you're running an app that itself services incoming requests then the response time to individual requests can be impacted. 
To take the worst-case scenario, imagine you get 12 concurrent requests that would each take 1/12 of your cpu quota. With 12 threads on 12 cpus you can service all 12 requests with a response time of 1/12 time units. But with 6 threads on 6 cpus you can only service 6 requests with a 1/12 response time, and the other 6 will have a 1/6 response time. >> >>> The challenge is determining the right algorithm that doesn?t penalize the VM. >> >> Agreed. But I think the current algorithm may penalize the VM, and more importantly the application it is running. >> >>> My current implementation does this: >>> total available logical processors = min (cpusets,sched_getaffinity,shares/1024, quota/period) >>> All fractional units are rounded up to the next whole number. >> >> My point has always been that I just don't think producing a single number from all these factors is the right/best way to deal with this. I think we really want to be able to answer the question "how many processors can I concurrently execute on" distinct from the question of "how much of a time slice will I get on each of those processors". To me "how many" is the question that "availableProcessors" should be answering - and only that question. How much "share" do I get is a different question, and perhaps one that the VM and the application need to be able to ask. >> >> BTW sched_getaffinity should already account for cpusets ?? >> >> Cheers, >> David >> >>> Bob. >>>> >>>> Makes sense to check. Hopefully there aren't any major errors or omissions in the above. >>>> Thanks, >>>> Alex >>>> >>>> [1] https://lwn.net/Articles/240474/ >>>> [2] https://github.com/torvalds/linux/blob/368f89984bb971b9f8b69eeb85ab19a89f985809/kernel/sched/core.c#L6735 >>>> [3] https://lwn.net/Articles/80911/ / http://www.i3s.unice.fr/~jplozi/wastedcores/files/extended_talk.pdf >>>> >>>> [4] >>>> >>>> cfs_rq[13]:/system.slice/docker-f5681788d6daab249c90810fe60da429a2565b901ff34245922a578635b5d607.scope >>>> >>>> .exec_clock: 0.000000 >>>> >>>> .MIN_vruntime: 0.000001 >>>> >>>> .min_vruntime: 8090.087297 >>>> >>>> .max_vruntime: 0.000001 >>>> >>>> .spread: 0.000000 >>>> >>>> .spread0 : -124692718.052832 >>>> >>>> .nr_spread_over: 0 >>>> >>>> .nr_running: 1 >>>> >>>> .load: 1024 >>>> >>>> .runnable_load_avg : 1023 >>>> >>>> .blocked_load_avg: 0 >>>> >>>> .tg_load_avg : 2046 >>>> >>>> .tg_load_contrib : 1023 >>>> >>>> .tg_runnable_contrib : 1023 >>>> >>>> .tg->runnable_avg: 2036 >>>> >>>> .tg->cfs_bandwidth.timer_active: 0 >>>> >>>> .throttled : 0 >>>> >>>> .throttle_count: 0 >>>> >>>> .se->exec_start: 236081964.515645 >>>> >>>> .se->vruntime: 24403993.326934 >>>> >>>> .se->sum_exec_runtime: 8091.135873 >>>> >>>> .se->load.weight : 512 >>>> >>>> .se->avg.runnable_avg_sum: 45979 >>>> >>>> .se->avg.runnable_avg_period : 45979 >>>> >>>> .se->avg.load_avg_contrib: 511 >>>> >>>> .se->avg.decay_count : 0 >>>> >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> On 5/10/2017 6:01 AM, Alex Bagehot wrote: >>>> >>>> Hi, >>>> >>>> On Wed, Oct 4, 2017 at 7:51 PM, Bob Vandette >>>> > >>>> wrote: >>>> >>>> >>>> On Oct 4, 2017, at 2:30 PM, Robbin Ehn >>>> > >>>> wrote: >>>> >>>> Thanks Bob for looking into this. >>>> >>>> On 10/04/2017 08:14 PM, Bob Vandette wrote: >>>> >>>> Robbin, >>>> I?ve looked into this issue and you are correct. I do have to examine >>>> >>>> both the >>>> >>>> sched_getaffinity results as well as the cgroup >>>> cpu subsystem >>>> >>>> configuration >>>> >>>> files in order to provide a reasonable value for >>>> active_processors. 
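A rough Java sketch of that min(cpusets, sched_getaffinity, shares/1024, quota/period) combination, reading the cgroup v1 cpu controller directly; the /sys/fs/cgroup/cpu paths and the use of availableProcessors() as a stand-in for the cpuset/affinity count are simplifying assumptions, and a real implementation would resolve the actual mount point and the process's own cgroup, and call sched_getaffinity():

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class ContainerCpuEstimate {
        private static long readLong(Path p, long fallback) {
            try {
                return Long.parseLong(Files.readAllLines(p).get(0).trim());
            } catch (Exception e) {
                return fallback;   // missing/unreadable file: treat as "no limit"
            }
        }

        public static void main(String[] args) {
            // Stand-in for the cpuset / sched_getaffinity derived count.
            int limit = Runtime.getRuntime().availableProcessors();

            long shares = readLong(Paths.get("/sys/fs/cgroup/cpu/cpu.shares"), -1);
            long quota  = readLong(Paths.get("/sys/fs/cgroup/cpu/cpu.cfs_quota_us"), -1);
            long period = readLong(Paths.get("/sys/fs/cgroup/cpu/cpu.cfs_period_us"), -1);

            if (shares > 0) {               // convention: 1024 shares == one CPU
                limit = Math.min(limit, (int) Math.ceil(shares / 1024.0));
            }
            if (quota > 0 && period > 0) {  // a quota of -1 means unlimited
                limit = Math.min(limit, (int) Math.ceil((double) quota / period));
            }
            System.out.println("estimated active processors: " + Math.max(limit, 1));
        }
    }

Note that an untouched cgroup also reports the default 1024 shares, which this sketch would read as a one-CPU cap; telling "default" apart from "deliberately set to 1024" is exactly the kind of corner the real detection code has to be careful about.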
If >>>> >>>> I was only >>>> >>>> interested in cpusets, I could simply rely on the >>>> getaffinity call but >>>> >>>> I also want to >>>> >>>> factor in shares and quotas as well. >>>> >>>> >>>> We had a quick discussion at the office, we actually >>>> do think that you >>>> >>>> could skip reading the shares and quotas. >>>> >>>> It really depends on what the user expect, if he give >>>> us 4 cpu's with >>>> >>>> 50% or 2 full cpu what do he expect the differences would be? >>>> >>>> One could argue that he 'knows' that he will only use >>>> max 50% and thus >>>> >>>> we can act as if he is giving us 4 full cpu. >>>> >>>> But I'll leave that up to you, just a tough we had. >>>> >>>> >>>> It?s my opinion that we should do something if someone >>>> makes the effort to >>>> configure their >>>> containers to use quotas or shares. There are many >>>> different opinions on >>>> what the right that >>>> right ?something? is. >>>> >>>> >>>> It might be interesting to look at some real instances of how >>>> java might[3] >>>> be deployed in containers. >>>> Marathon/Mesos[1] and Kubernetes[2] use shares and quotas so >>>> this is a vast >>>> chunk of deployments that need both of them today. >>>> >>>> >>>> >>>> Many developers that are trying to deploy apps that use >>>> containers say >>>> they don?t like >>>> cpusets. This is too limiting for them especially when >>>> the server >>>> configurations vary >>>> within their organization. >>>> >>>> >>>> True, however Kubernetes has an alpha feature[5] where it >>>> allocates cpusets >>>> to containers that request a whole number of cpus. Previously >>>> without >>>> cpusets any container could run on any cpu which we know might >>>> not be good >>>> for some workloads that want isolation. A request for a >>>> fractional or >>>> burstable amount of cpu would be allocated from a shared cpu >>>> pool. So >>>> although manual allocation of cpusets will be flakey[3] , >>>> automation should >>>> be able to make it work. >>>> >>>> >>>> >>>> From everything I?ve read including source code, there >>>> seems to be a >>>> consensus that >>>> shares and quotas are being used as a way to specify a >>>> fraction of a >>>> system (number of cpus). >>>> >>>> >>>> A refinement[6] on this is: >>>> Shares can be used for guaranteed cpu - you will always get >>>> your share. >>>> Quota[4] is a limit/constraint - you can never get more than >>>> the quota. >>>> So given the below limit of how many shares will be allocated >>>> on a host you >>>> can have burstable(or overcommit) capacity if your shares are >>>> less than >>>> your quota. >>>> >>>> >>>> >>>> Docker added ?cpus which is implemented using quotas and >>>> periods. They >>>> adjust these >>>> two parameters to provide a way of calculating the number >>>> of cpus that >>>> will be available >>>> to a process (quota/period). Amazon also documents that >>>> cpu shares are >>>> defined to be a multiple of 1024. >>>> Where 1024 represents a single cpu and a share value of >>>> N*1024 represents >>>> N cpus. >>>> >>>> >>>> Kubernetes and Mesos/Marathon also use the N*1024 shares per >>>> host to >>>> allocate resources automatically. >>>> >>>> Hopefully this provides some background on what a couple of >>>> orchestration >>>> systems that will be running java are doing currently in this >>>> area. 
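As a worked example of those conventions (the figures are illustrative; 100000 microseconds is the usual default CFS period, and 1024 shares per CPU is a convention the orchestrators follow rather than anything the kernel enforces):

    public class CpuSpecToCgroup {
        public static void main(String[] args) {
            double cpusFlag = 1.5;        // e.g. docker run --cpus=1.5
            double requestedCpus = 0.5;   // e.g. a Kubernetes CPU request of 500m

            long periodUs = 100_000;                          // default cfs_period_us
            long quotaUs  = (long) (cpusFlag * periodUs);     // --cpus maps to quota/period
            long shares   = Math.round(requestedCpus * 1024); // N CPUs map to N * 1024 shares

            System.out.println("cpu.cfs_period_us = " + periodUs); // 100000
            System.out.println("cpu.cfs_quota_us  = " + quotaUs);  // 150000
            System.out.println("cpu.shares        = " + shares);   // 512
        }
    }

Reading those values back and dividing in the other direction (quota/period, shares/1024) yields the fractional CPU count that the detection code then has to round and clamp.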
>>>> Thanks, >>>> Alex >>>> >>>> >>>> [1] https://github.com/apache/mesos/commit/346cc8dd528a28a6e >>>> >>>> 1f1cbdb4c95b8bdea2f6070 / (now out of date but appears to be a >>>> reasonable >>>> intro : >>>> https://zcox.wordpress.com/2014/09/17/cpu-resources-in-docke >>>> >>>> r-mesos-and-marathon/ ) >>>> [1a] https://youtu.be/hJyAfC-Z2xk?t=2439 >>>> >>>> >>>> [2] https://kubernetes.io/docs/concepts/configuration/manage >>>> >>>> -compute-resources-container/ >>>> >>>> [3] https://youtu.be/w1rZOY5gbvk?t=2479 >>>> >>>> >>>> [4] >>>> https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt >>>> >>>> https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf >>>> >>>> https://lwn.net/Articles/428175/ >>>> >>>> >>>> [5] >>>> https://github.com/kubernetes/community/blob/43ce57ac476b9f2ce3f0220354a075e095a0d469/contributors/design-proposals/node/cpu-manager.md >>>> >>>> / https://github.com/kubernetes/kubernetes/commit/ >>>> >>>> 00f0e0f6504ad8dd85fcbbd6294cd7cf2475fc72 / >>>> https://vimeo.com/226858314 >>>> >>>> >>>> [6] https://kubernetes.io/docs/concepts/configuration/manage- >>>> >>>> compute-resources-container/#how-pods-with-resource-limits-are-run >>>> >>>> >>>> Of course these are just conventions. This is why I >>>> provided a way of >>>> specifying the >>>> number of CPUs so folks deploying Java services can be >>>> certain they get >>>> what they want. >>>> >>>> Bob. >>>> >>>> >>>> I had assumed that when sched_setaffinity was >>>> called (in your case by >>>> >>>> numactl) that the >>>> >>>> cgroup cpu config files would be updated to >>>> reflect the current >>>> >>>> processor affinity for the >>>> >>>> running process. This is not correct. I have >>>> updated my changeset and >>>> >>>> have successfully >>>> >>>> run with your examples below. I?ll post a new >>>> webrev soon. >>>> >>>> >>>> I see, thanks again! >>>> >>>> /Robbin >>>> >>>> Thanks, >>>> Bob. >>>> >>>> >>>> I still want to include the flag for at >>>> least one Java release in the >>>> >>>> event that the new behavior causes some regression >>>> >>>> in behavior. I?m trying to make the >>>> detection robust so that it will >>>> >>>> fallback to the current behavior in the event >>>> >>>> that cgroups is not configured as expected >>>> but I?d like to have a way >>>> >>>> of forcing the issue. JDK 10 is not >>>> >>>> supposed to be a long term support release >>>> which makes it a good >>>> >>>> target for this new behavior. >>>> >>>> I agree with David that once we commit to >>>> cgroups, we should extract >>>> >>>> all VM configuration data from that >>>> >>>> source. There?s more information >>>> available for cpusets than just >>>> >>>> processor affinity that we might want to >>>> >>>> consider when calculating the number of >>>> processors to assume for the >>>> >>>> VM. There?s exclusivity and >>>> >>>> effective cpu data available in addition >>>> to the cpuset string. >>>> >>>> >>>> cgroup only contains limits, not the real hard >>>> limits. >>>> You most consider the affinity mask. We that >>>> have numa nodes do: >>>> >>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 >>>> --membind=1 java >>>> >>>> -Xlog:os=debug -cp . ForEver | grep proc >>>> >>>> [0.001s][debug][os] Initial active processor >>>> count set to 16 >>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 >>>> --membind=1 java >>>> >>>> -Xlog:os=debug -XX:+UseContainerSupport -cp . 
ForEver | >>>> grep proc >>>> >>>> [0.001s][debug][os] Initial active processor >>>> count set to 32 >>>> >>>> when benchmarking all the time and that must >>>> be set to 16 otherwise >>>> >>>> the flag is really bad for us. >>>> >>>> So the flag actually breaks the little numa >>>> support we have now. >>>> >>>> Thanks, Robbin >>>> >>>> >>>> >>>> > From vladimir.kozlov at oracle.com Fri Oct 6 23:35:30 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 6 Oct 2017 16:35:30 -0700 Subject: RFR (XS) 8188828 Intermittent ClassNotFoundException: jdk.test.lib.Platform for compiler tests In-Reply-To: <56725166-18B0-47B4-A8FB-DED8B149604D@oracle.com> References: <1be927fa-fa4b-1964-93f3-1c72386acf7b@oracle.com> <56725166-18B0-47B4-A8FB-DED8B149604D@oracle.com> Message-ID: Looks good. Thanks, Vladimir On 10/6/17 1:28 PM, Igor Ignatyev wrote: > Hi Ioi, > > I'm really happy we found such a simple workaround for this annoying problem and hope it'll greatly reduce CNFE in our test runs. > > the fix looks good to me. > > Thanks, > -- Igor > >> On Oct 6, 2017, at 1:19 PM, Ioi Lam wrote: >> >> Please review this very simple change: >> >> https://bugs.openjdk.java.net/browse/JDK-8188828 >> http://ioilinux.us.oracle.com/webrev/jdk10/8188828_compiler_test_class_not_found.v01/ >> >> The dependency of >> >> FileInstaller -> Utils -> JDKToolLauncher -> Platform >> >> has caused many intermittent ClassNotFoundException in the hotspot nightly runs. >> While this fix does not address the root cause (proper dependencies are not >> specified in the test cases -- which we are planning to fix), we will hopefully >> see much fewer occurrences of this annoying failure scenario. >> >> Thanks a lot to Igor for suggesting the simple fix! >> >> - Ioi >> > From wenlei.xie at gmail.com Sat Oct 7 06:42:53 2017 From: wenlei.xie at gmail.com (Wenlei Xie) Date: Fri, 6 Oct 2017 23:42:53 -0700 Subject: Questions about ... Lambda Form Compilation In-Reply-To: <57d5cf51-111f-d34a-e161-02df724b6577@oracle.com> References: <57d5cf51-111f-d34a-e161-02df724b6577@oracle.com> Message-ID: Thank you Vladimir! We are aware of MethodHandle get customization after calling over 127 times (thank you for the explanation in http://mail.openjdk.java.net/pipermail/mlvm-dev/2017-May/006755.html as well! ). And thus we are trying to avoid continuously instantiating them. For this case, the MethodHandle get continuously instantiated should be cached by LoadingCache in Guava. We are looking into why the cache fails to work in the expected way. Will get back if we have any new observations or findings! Thank you for the help! Best, Wenlei On Tue, Oct 3, 2017 at 4:54 AM, Vladimir Ivanov < vladimir.x.ivanov at oracle.com> wrote: > Hi, > > 2. For the same cluster, we also see over half of machines repeatedly >> experiencing full GC due to Metaspace full. We dump JSTACK for every >> minute >> during 30 minutes, and see many threads are trying to compile the exact >> same lambda form throughout the 30-minute period. >> >> Here is an example stacktrace on one machine. The LambdaForm triggers the >> compilation on that machine is always LambdaForm$MH/170067652. Once it's >> compiled, it should use the new compiled lambda form. We don't know why >> it's still trying to compile the same lambda form again and again. -- >> Would >> it be because the compiled lambda form somehow failed to load? This might >> relate to the negative number of loaded classes. >> > > What you are seeing here is LambdaForm customization (8069591 [1]). 
> > Customization creates a new LambdaForm instance specialized for a > particular MethodHandle instance (no LF sharing possible). It was designed > to alleviate performance penalty when inlining through a MH invoker doesn't > happen and enables JIT-compilers to compile the whole method handle chain > into a single nmethod. Without customization a method handle chain breaks > up into a chain of small nmethods (1 nmethod per LambdaForm) and calls > between them start dominate the execution time. (More details are available > in [2].) > > Customization takes place once a method handle has been invoked through > MH.invoke/invokeExact() more than 127 times. > > Considering you observe continuous customization, it means there are > method handles being continuously instantiated and used which share the > same lambda form (LambdaForm$MH/170067652). It leads to excessive > generation of VM anonymous classes and creates memory pressure in Metaspace. > > As a workaround, you can try to disable LF customization > (java.lang.invoke.MethodHandle.CUSTOMIZE_THRESHOLD=-1). > > But I'd suggest to look into why the application continuously creates > method handles. As you noted, it doesn't play well with existing heuristics > aimed at maximum throughput which assume the application behavior > "stabilizes" over time. > > Best regards, > Vladimir Ivanov > > [1] https://bugs.openjdk.java.net/browse/JDK-8069591 > > [2] http://cr.openjdk.java.net/~vlivanov/talks/2015-JVMLS_State_of_JLI.pdf > slides #45-#50 > > "20170926_232912_39740_3vuuu.1.79-4-76640" #76640 prio=5 os_prio=0 >> tid=0x00007f908006dbd0 nid=0x150a6 runnable [0x00007f8bddb1b000] >> java.lang.Thread.State: RUNNABLE >> at sun.misc.Unsafe.defineAnonymousClass(Native Method) >> at java.lang.invoke.InvokerBytecodeGenerator. >> loadAndInitializeInvokerClass(InvokerBytecodeGenerator.java:284) >> at java.lang.invoke.InvokerBytecodeGenerator.loadMethod( >> InvokerBytecodeGenerator.java:276) >> at java.lang.invoke.InvokerBytecodeGenerator. >> generateCustomizedCode(InvokerBytecodeGenerator.java:618) >> at java.lang.invoke.LambdaForm.compileToBytecode(LambdaForm. >> java:654) >> at java.lang.invoke.LambdaForm.prepare(LambdaForm.java:635) >> at java.lang.invoke.MethodHandle. >> updateForm(MethodHandle.java: >> 1432) >> at java.lang.invoke.MethodHandle. >> customize(MethodHandle.java: >> 1442) >> at java.lang.invoke.Invokers.mayb >> eCustomize(Invokers.java:407) >> at java.lang.invoke.Invokers.chec >> kCustomized(Invokers.java:398) >> at java.lang.invoke.LambdaForm$MH/170067652.invokeExact_MT( >> LambdaForm$MH) >> at com.facebook.presto.operator.aggregation.MinMaxHelper. >> combineStateWithState(MinMaxHelper.java:141) >> at com.facebook.presto.operator.aggregation. >> MaxAggregationFunction.combine(MaxAggregationFunction.java:108) >> at java.lang.invoke.LambdaForm$DMH/1607453282.invokeStatic_ >> L3_V(LambdaForm$DMH) >> at java.lang.invoke.LambdaForm$BMH/1118134445.reinvoke( >> LambdaForm$BMH) >> at java.lang.invoke.LambdaForm$MH/1971758264. >> linkToTargetMethod(LambdaForm$MH) >> at com.facebook.presto.$gen.IntegerIntegerMaxGroupedAccumu >> lator_3439.addIntermediate(Unknown Source) >> at com.facebook.presto.operator.aggregation.builder. >> InMemoryHashAggregationBuilder$Aggregator.processPage( >> InMemoryHashAggregationBuilder.java:367) >> at com.facebook.presto.operator.aggregation.builder. >> InMemoryHashAggregationBuilder.processPage(InMemoryHashAggregationBuilder >> .java:138) >> at com.facebook.presto.operator.HashAggregationOperator. 
>> addInput(HashAggregationOperator.java:400) >> at com.facebook.presto.operator.D >> river.processInternal(Driver. >> java:343) >> at com.facebook.presto.operator.Driver.lambda$processFor$6( >> Driver.java:241) >> at com.facebook.presto.operator.Driver$$Lambda$765/ >> 442308692.get(Unknown >> Source) >> at com.facebook.presto.operator.Driver.tryWithLock(Driver. >> java:614) >> at com.facebook.presto.operator.D >> river.processFor(Driver.java: >> 235) >> at com.facebook.presto.execution.SqlTaskExecution$ >> DriverSplitRunner.processFor(SqlTaskExecution.java:622) >> at com.facebook.presto.execution.executor. >> PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163) >> at com.facebook.presto.execution.executor.TaskExecutor$ >> TaskRunner.run(TaskExecutor.java:485) >> at java.util.concurrent.ThreadPoolExecutor.runWorker( >> ThreadPoolExecutor.java:1142) >> at java.util.concurrent.ThreadPoolExecutor$Worker.run( >> ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:748) >> ... >> >> >> >> Both issues go away after we restart the JVM, and the same query won't >> trigger the LambdaForm compilation issue, so it looks like the JVM enters >> some weird state. We are wondering if there is any thoughts on what could >> trigger these issues? Or is there any suggestions about how to further >> investigate it next time we see the VM in this state? >> >> Thank you. >> >> >> -- Best Regards, Wenlei Xie (???) Email: wenlei.xie at gmail.com From forax at univ-mlv.fr Sat Oct 7 09:13:47 2017 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 7 Oct 2017 11:13:47 +0200 (CEST) Subject: Questions about ... Lambda Form Compilation In-Reply-To: References: <57d5cf51-111f-d34a-e161-02df724b6577@oracle.com> Message-ID: <2012378759.3898930.1507367627698.JavaMail.zimbra@u-pem.fr> Depending on what you want to do, you can also use java.lang.ClassValue as cache. cheers, R?mi ----- Mail original ----- > De: "Wenlei Xie" > ?: "Vladimir Ivanov" > Cc: hotspot-dev at openjdk.java.net > Envoy?: Samedi 7 Octobre 2017 08:42:53 > Objet: Re: Questions about ... Lambda Form Compilation > Thank you Vladimir! > > We are aware of MethodHandle get customization after calling over 127 times > (thank you for the explanation in > http://mail.openjdk.java.net/pipermail/mlvm-dev/2017-May/006755.html as > well! ). And thus we are trying to avoid continuously instantiating them. > > For this case, the MethodHandle get continuously instantiated should be > cached by LoadingCache in Guava. We are looking into why the cache fails to > work in the expected way. Will get back if we have any new observations or > findings! > > Thank you for the help! > > Best, > Wenlei > > On Tue, Oct 3, 2017 at 4:54 AM, Vladimir Ivanov < > vladimir.x.ivanov at oracle.com> wrote: > >> Hi, >> >> 2. For the same cluster, we also see over half of machines repeatedly >>> experiencing full GC due to Metaspace full. We dump JSTACK for every >>> minute >>> during 30 minutes, and see many threads are trying to compile the exact >>> same lambda form throughout the 30-minute period. >>> >>> Here is an example stacktrace on one machine. The LambdaForm triggers the >>> compilation on that machine is always LambdaForm$MH/170067652. Once it's >>> compiled, it should use the new compiled lambda form. We don't know why >>> it's still trying to compile the same lambda form again and again. -- >>> Would >>> it be because the compiled lambda form somehow failed to load? This might >>> relate to the negative number of loaded classes. 
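A minimal sketch of the ClassValue-based caching suggested above, so that exactly one MethodHandle per class is created and then reused by every caller; the lookup target here, a hypothetical static long combine(long, long) on the keyed class, is made up purely for illustration:

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    public class CombinerHandles {
        // computeValue() runs at most once per class; afterwards get() returns
        // the same MethodHandle instance, so customization also happens once.
        private static final ClassValue<MethodHandle> COMBINERS = new ClassValue<MethodHandle>() {
            @Override
            protected MethodHandle computeValue(Class<?> type) {
                try {
                    return MethodHandles.lookup().findStatic(type, "combine",
                            MethodType.methodType(long.class, long.class, long.class));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalArgumentException(type + " has no combine(long, long)", e);
                }
            }
        };

        static long combine(Class<?> type, long a, long b) throws Throwable {
            return (long) COMBINERS.get(type).invokeExact(a, b);
        }
    }

If recreating handles really is unavoidable, the -Djava.lang.invoke.MethodHandle.CUSTOMIZE_THRESHOLD=-1 workaround mentioned earlier in the thread at least stops the repeated spinning of new LambdaForm classes.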
>>> >> >> What you are seeing here is LambdaForm customization (8069591 [1]). >> >> Customization creates a new LambdaForm instance specialized for a >> particular MethodHandle instance (no LF sharing possible). It was designed >> to alleviate performance penalty when inlining through a MH invoker doesn't >> happen and enables JIT-compilers to compile the whole method handle chain >> into a single nmethod. Without customization a method handle chain breaks >> up into a chain of small nmethods (1 nmethod per LambdaForm) and calls >> between them start dominate the execution time. (More details are available >> in [2].) >> >> Customization takes place once a method handle has been invoked through >> MH.invoke/invokeExact() more than 127 times. >> >> Considering you observe continuous customization, it means there are >> method handles being continuously instantiated and used which share the >> same lambda form (LambdaForm$MH/170067652). It leads to excessive >> generation of VM anonymous classes and creates memory pressure in Metaspace. >> >> As a workaround, you can try to disable LF customization >> (java.lang.invoke.MethodHandle.CUSTOMIZE_THRESHOLD=-1). >> >> But I'd suggest to look into why the application continuously creates >> method handles. As you noted, it doesn't play well with existing heuristics >> aimed at maximum throughput which assume the application behavior >> "stabilizes" over time. >> >> Best regards, >> Vladimir Ivanov >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8069591 >> >> [2] http://cr.openjdk.java.net/~vlivanov/talks/2015-JVMLS_State_of_JLI.pdf >> slides #45-#50 >> >> "20170926_232912_39740_3vuuu.1.79-4-76640" #76640 prio=5 os_prio=0 >>> tid=0x00007f908006dbd0 nid=0x150a6 runnable [0x00007f8bddb1b000] >>> java.lang.Thread.State: RUNNABLE >>> at sun.misc.Unsafe.defineAnonymousClass(Native Method) >>> at java.lang.invoke.InvokerBytecodeGenerator. >>> loadAndInitializeInvokerClass(InvokerBytecodeGenerator.java:284) >>> at java.lang.invoke.InvokerBytecodeGenerator.loadMethod( >>> InvokerBytecodeGenerator.java:276) >>> at java.lang.invoke.InvokerBytecodeGenerator. >>> generateCustomizedCode(InvokerBytecodeGenerator.java:618) >>> at java.lang.invoke.LambdaForm.compileToBytecode(LambdaForm. >>> java:654) >>> at java.lang.invoke.LambdaForm.prepare(LambdaForm.java:635) >>> at java.lang.invoke.MethodHandle. >>> updateForm(MethodHandle.java: >>> 1432) >>> at java.lang.invoke.MethodHandle. >>> customize(MethodHandle.java: >>> 1442) >>> at java.lang.invoke.Invokers.mayb >>> eCustomize(Invokers.java:407) >>> at java.lang.invoke.Invokers.chec >>> kCustomized(Invokers.java:398) >>> at java.lang.invoke.LambdaForm$MH/170067652.invokeExact_MT( >>> LambdaForm$MH) >>> at com.facebook.presto.operator.aggregation.MinMaxHelper. >>> combineStateWithState(MinMaxHelper.java:141) >>> at com.facebook.presto.operator.aggregation. >>> MaxAggregationFunction.combine(MaxAggregationFunction.java:108) >>> at java.lang.invoke.LambdaForm$DMH/1607453282.invokeStatic_ >>> L3_V(LambdaForm$DMH) >>> at java.lang.invoke.LambdaForm$BMH/1118134445.reinvoke( >>> LambdaForm$BMH) >>> at java.lang.invoke.LambdaForm$MH/1971758264. >>> linkToTargetMethod(LambdaForm$MH) >>> at com.facebook.presto.$gen.IntegerIntegerMaxGroupedAccumu >>> lator_3439.addIntermediate(Unknown Source) >>> at com.facebook.presto.operator.aggregation.builder. >>> InMemoryHashAggregationBuilder$Aggregator.processPage( >>> InMemoryHashAggregationBuilder.java:367) >>> at com.facebook.presto.operator.aggregation.builder. 
>>> InMemoryHashAggregationBuilder.processPage(InMemoryHashAggregationBuilder >>> .java:138) >>> at com.facebook.presto.operator.HashAggregationOperator. >>> addInput(HashAggregationOperator.java:400) >>> at com.facebook.presto.operator.D >>> river.processInternal(Driver. >>> java:343) >>> at com.facebook.presto.operator.Driver.lambda$processFor$6( >>> Driver.java:241) >>> at com.facebook.presto.operator.Driver$$Lambda$765/ >>> 442308692.get(Unknown >>> Source) >>> at com.facebook.presto.operator.Driver.tryWithLock(Driver. >>> java:614) >>> at com.facebook.presto.operator.D >>> river.processFor(Driver.java: >>> 235) >>> at com.facebook.presto.execution.SqlTaskExecution$ >>> DriverSplitRunner.processFor(SqlTaskExecution.java:622) >>> at com.facebook.presto.execution.executor. >>> PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163) >>> at com.facebook.presto.execution.executor.TaskExecutor$ >>> TaskRunner.run(TaskExecutor.java:485) >>> at java.util.concurrent.ThreadPoolExecutor.runWorker( >>> ThreadPoolExecutor.java:1142) >>> at java.util.concurrent.ThreadPoolExecutor$Worker.run( >>> ThreadPoolExecutor.java:617) >>> at java.lang.Thread.run(Thread.java:748) >>> ... >>> >>> >>> >>> Both issues go away after we restart the JVM, and the same query won't >>> trigger the LambdaForm compilation issue, so it looks like the JVM enters >>> some weird state. We are wondering if there is any thoughts on what could >>> trigger these issues? Or is there any suggestions about how to further >>> investigate it next time we see the VM in this state? >>> >>> Thank you. >>> >>> >>> > > > -- > Best Regards, > Wenlei Xie (???) > > Email: wenlei.xie at gmail.com From david.holmes at oracle.com Mon Oct 9 01:33:57 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Oct 2017 11:33:57 +1000 Subject: RFR (XS) 8188828 Intermittent ClassNotFoundException: jdk.test.lib.Platform for compiler tests In-Reply-To: <1be927fa-fa4b-1964-93f3-1c72386acf7b@oracle.com> References: <1be927fa-fa4b-1964-93f3-1c72386acf7b@oracle.com> Message-ID: <4dea8e6c-34fc-83b6-8fe3-2905ec15b72b@oracle.com> Hi Ioi, This seems like a temporary workaround - fine for now - but what is the real fix here? It's crazy that one test library class can't use another class from the same test library! Thanks, David On 7/10/2017 6:19 AM, Ioi Lam wrote: > Please review this very simple change: > > https://bugs.openjdk.java.net/browse/JDK-8188828 > http://ioilinux.us.oracle.com/webrev/jdk10/8188828_compiler_test_class_not_found.v01/ > > > The dependency of > > ??? FileInstaller -> Utils -> JDKToolLauncher -> Platform > > has caused many intermittent ClassNotFoundException in the hotspot > nightly runs. > While this fix does not address the root cause (proper dependencies are not > specified in the test cases -- which we are planning to fix), we will > hopefully > see much fewer occurrences of this annoying failure scenario. > > Thanks a lot to Igor for suggesting the simple fix! > > - Ioi > From Alan.Bateman at oracle.com Mon Oct 9 07:55:49 2017 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 9 Oct 2017 08:55:49 +0100 Subject: [10] RFR(S) 8188775: Module jdk.internal.vm.compiler.management has not been granted accessClassInPackage.org.graalvm.compiler.hotspot In-Reply-To: References: Message-ID: On 05/10/2017 00:05, Vladimir Kozlov wrote: > https://bugs.openjdk.java.net/browse/JDK-8188775 > > Changes for 8182701[1] missed changes in default.policy for new module > jdk.internal.vm.compiler.management. 
> > Add missing code: > > src/java.base/share/lib/security/default.policy > @@ -154,6 +154,10 @@ > ???? permission java.security.AllPermission; > ?}; > > +grant codeBase "jrt:/jdk.internal.vm.compiler.management" { > +??? permission java.security.AllPermission; > +}; > + This looks okay to me although it would be nice if we could identify the minimal permissions rather than granting it AllPermission. -Alan From erik.osterlund at oracle.com Mon Oct 9 08:42:36 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 9 Oct 2017 10:42:36 +0200 Subject: RFR (M): 8188813: Generalize OrderAccess to use templates In-Reply-To: References: <59D639E1.7070104@oracle.com> Message-ID: <59DB367C.6040509@oracle.com> Hi Coleen, On 2017-10-06 17:09, coleen.phillimore at oracle.com wrote: > http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/src/hotspot/os_cpu/linux_aarch64/orderAccess_linux_aarch64.inline.hpp.udiff.html > > > +template > +struct OrderAccess::PlatformOrderedStore > + VALUE_OBJ_CLASS_SPEC > +{ > + template > + void operator()(T v, volatile T* p) const { release_store(p, v); > fence(); } > +}; > > Isn't release_store() removed by this patch? Or does this call back > to OrderAccess::release_store, which seems circular (?) It is as David says. This does the same as was done before. Without this specialization, release_store_fence() would turn into release() store() fence(). This specializes further with release_store() fence(), which will probably turn into stlr; dmb ish; with GCC intrinsics on AArch64, which is a bit more slim than release() store() fence() which would use more fencing. > Otherwise this looks really nice. Thank you! > I'll remove the *_ptr versions with > https://bugs.openjdk.java.net/browse/JDK-8188220 . It's been fun. Thanks for doing that Coleen. /Erik > Thanks, > Coleen > > > On 10/5/17 9:55 AM, Erik ?sterlund wrote: >> Hi, >> >> Now that Atomic has been generalized with templates, the same should >> to be done to OrderAccess. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8188813 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8188813/webrev.00/ >> >> Testing: mach5 hs-tier3 >> >> Since Atomic already has a mechanism for type checking generic >> arguments for Atomic::load/store, and OrderAccess also is a bunch of >> semantically decorated loads and stores, I decided to reuse the >> template wheel that was already invented (Atomic::LoadImpl and >> Atomic::StoreImpl). >> Therefore, I made OrderAccess privately inherit Atomic so that this >> infrastructure could be reused. A whole bunch of code has been nuked >> with this generalization. >> >> It is worth noting that I have added PrimitiveConversion >> functionality for doubles and floats which translates to using the >> union trick for casting double to and from int64_t and float to and >> from int32_t when passing down doubles and ints to the API. I need >> the former two, because Java supports volatile double and volatile >> float, and therefore runtime support for that needs to be able to use >> floats and doubles. I also added PrimitiveConversion functionality >> for the subclasses of oop (instanceOop and friends). The base class >> oop already supported this, so it seemed natural that the subclasses >> should support it too. 
>> >> Thanks, >> /Erik > From ioi.lam at oracle.com Mon Oct 9 17:54:26 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 9 Oct 2017 10:54:26 -0700 Subject: RFR (XS) 8188828 Intermittent ClassNotFoundException: jdk.test.lib.Platform for compiler tests In-Reply-To: <4dea8e6c-34fc-83b6-8fe3-2905ec15b72b@oracle.com> References: <1be927fa-fa4b-1964-93f3-1c72386acf7b@oracle.com> <4dea8e6c-34fc-83b6-8fe3-2905ec15b72b@oracle.com> Message-ID: There are several possibilities. One is to pre-compile a bunch of libraries during the build time, and put them in the classpath using the jtreg -cpa: option. Another possibility is to change jtreg to better express the dependency between different classes compiled by jtreg. Thanks - Ioi On 10/8/17 6:33 PM, David Holmes wrote: > Hi Ioi, > > This seems like a temporary workaround - fine for now - but what is > the real fix here? It's crazy that one test library class can't use > another class from the same test library! > > Thanks, > David > > On 7/10/2017 6:19 AM, Ioi Lam wrote: >> Please review this very simple change: >> >> https://bugs.openjdk.java.net/browse/JDK-8188828 >> http://ioilinux.us.oracle.com/webrev/jdk10/8188828_compiler_test_class_not_found.v01/ >> >> >> The dependency of >> >> ???? FileInstaller -> Utils -> JDKToolLauncher -> Platform >> >> has caused many intermittent ClassNotFoundException in the hotspot >> nightly runs. >> While this fix does not address the root cause (proper dependencies >> are not >> specified in the test cases -- which we are planning to fix), we will >> hopefully >> see much fewer occurrences of this annoying failure scenario. >> >> Thanks a lot to Igor for suggesting the simple fix! >> >> - Ioi >> From ioi.lam at oracle.com Mon Oct 9 17:55:34 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 9 Oct 2017 10:55:34 -0700 Subject: RFR (XS) 8188828 Intermittent ClassNotFoundException: jdk.test.lib.Platform for compiler tests In-Reply-To: <1be927fa-fa4b-1964-93f3-1c72386acf7b@oracle.com> References: <1be927fa-fa4b-1964-93f3-1c72386acf7b@oracle.com> Message-ID: <1b6a6353-9db6-9e1f-7b03-85bc3773055a@oracle.com> Sorry I used an internal URL. Here's the proper openjdk URL: http://cr.openjdk.java.net/~iklam/jdk10/8188828_compiler_test_class_not_found.v01/ Thanks - Ioi On 10/6/17 1:19 PM, Ioi Lam wrote: > Please review this very simple change: > > https://bugs.openjdk.java.net/browse/JDK-8188828 > http://ioilinux.us.oracle.com/webrev/jdk10/8188828_compiler_test_class_not_found.v01/ > > > The dependency of > > ??? FileInstaller -> Utils -> JDKToolLauncher -> Platform > > has caused many intermittent ClassNotFoundException in the hotspot > nightly runs. > While this fix does not address the root cause (proper dependencies > are not > specified in the test cases -- which we are planning to fix), we will > hopefully > see much fewer occurrences of this annoying failure scenario. > > Thanks a lot to Igor for suggesting the simple fix! > > - Ioi > From volker.simonis at gmail.com Mon Oct 9 19:24:57 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 9 Oct 2017 21:24:57 +0200 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: References: <50cda0ab-f403-372a-ce51-1a27d8821448@oracle.com> <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> Message-ID: Hi Vladimir, I've analyzed the crash. 
The problem is Sparc specific because on Sparc we do not call the SharedRuntime for G1 pre/post barriers (i.e. SharedRuntime::g1_wb_pre() / SharedRuntime::g1_wb_post()) like on other architectures. Instead we lazily create assembler stubs on the fly (generate_satb_log_enqueue_if_necessary() / generate_dirty_card_log_enqueue_if_necessary()) when they are needed. This happens during the generation of the interpreter and allocates more memory in the code cache such that we can't shrink the memory which was initially allocated for the interpreter any more. Unfortunately we can't easily generate these stubs during 'stubRoutines_init1()' because 'generate_dirty_card_log_enqueue_if_necessary()' needs the byte map base address which is only initialized in 'CardTableModRefBS::initialize()' during 'univers_init()' which happens after 'stubRoutines_init1()'. I'm still thinking about a good way to fix this without too many platfrom-specific ifdefs. Regards, Volker On Tue, Oct 3, 2017 at 9:46 PM, Vladimir Kozlov wrote: > I rebased it. But there is problem with changes. VM hit guarantee() in this > code when run on SPARC in both, fastdebug and product, builds. > Crash happens during build. We can't push this - problem should be > investigated and fixed first. > > Thanks, > Vladimir > > make/Main.gmk:443: recipe for target 'generate-link-opt-data' failed > /usr/ccs/bin/bash: line 4: 9349 Abort (core dumped) > /s/build/solaris-sparcv9-debug/support/interim-image/bin/java > -XX:DumpLoadedClassList=/s/build/solaris-sparcv9-debug/support/link_opt/classlist > -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -cp > /s/build/solaris-sparcv9-debug/support/classlist.jar > build.tools.classlist.HelloClasslist 2>&1 > > /s/build/solaris-sparcv9-debug/support/link_opt/default_jli_trace.txt > make[3]: *** [/s/build/solaris-sparcv9-debug/support/link_opt/classlist] > Error 134 > make[2]: *** [generate-link-opt-data] Error 1 > > > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/s/open/src/hotspot/share/memory/heap.cpp:233), pid=9349, > tid=2 > # guarantee(b == block_at(_next_segment - actual_number_of_segments)) > failed: Intermediate allocation! > # > # JRE version: (10.0) (fastdebug build ) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug > 10-internal+0-2017-09-30-014154.8166317, mixed mode, tiered, compressed > oops, g1 gc, solaris-sparc) > # Core dump will be written. 
Default location: /s/open/make/core or > core.9349 > # > # If you would like to submit a bug report, please visit: > # http://bugreport.java.com/bugreport/crash.jsp > # > > --------------- S U M M A R Y ------------ > > Command Line: > -XX:DumpLoadedClassList=/s/build/solaris-sparcv9-debug/support/link_opt/classlist > -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true > build.tools.classlist.HelloClasslist > > Host: sca00dbv, Sparcv9 64 bit 3600 MHz, 16 cores, 32G, Oracle Solaris 11.2 > SPARC > Time: Sat Sep 30 03:29:46 2017 UTC elapsed time: 0 seconds (0d 0h 0m 0s) > > --------------- T H R E A D --------------- > > Current thread (0x000000010012f000): JavaThread "Unknown thread" > [_thread_in_vm, id=2, stack(0x0007fffef9700000,0x0007fffef9800000)] > > Stack: [0x0007fffef9700000,0x0007fffef9800000], sp=0x0007fffef97ff020, > free space=1020k > Native frames: (J=compiled Java code, A=aot compiled Java code, > j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x1f94508] void VMError::report_and_die(int,const char*,const > char*,void*,Thread*,unsigned char*,void*,void*,const char*,int,unsigned > long)+0xa58 > V [libjvm.so+0x1f93a3c] void VMError::report_and_die(Thread*,const > char*,int,const char*,const char*,void*)+0x3c > V [libjvm.so+0xd02f38] void report_vm_error(const char*,int,const > char*,const char*,...)+0x78 > V [libjvm.so+0xfc219c] void CodeHeap::deallocate_tail(void*,unsigned > long)+0xec > V [libjvm.so+0xbf4f14] void CodeCache::free_unused_tail(CodeBlob*,unsigned > long)+0xe4 > V [libjvm.so+0x1e0ae70] void StubQueue::deallocate_unused_tail()+0x40 > V [libjvm.so+0x1e7452c] void TemplateInterpreter::initialize()+0x19c > V [libjvm.so+0x1051220] void interpreter_init()+0x20 > V [libjvm.so+0x10116e0] int init_globals()+0xf0 > V [libjvm.so+0x1ed8548] int > Threads::create_vm(JavaVMInitArgs*,bool*)+0x4a8 > V [libjvm.so+0x11c7b58] int > JNI_CreateJavaVM_inner(JavaVM_**,void**,void*)+0x108 > C [libjli.so+0x7950] InitializeJVM+0x100 > > > On 10/2/17 7:55 AM, coleen.phillimore at oracle.com wrote: >> >> >> I can sponsor this for you once you rebase, and fix these compilation >> errors. >> Thanks, >> Coleen >> >> On 9/30/17 12:28 AM, Volker Simonis wrote: >>> >>> Hi Vladimir, >>> >>> thanks a lot for remembering these changes! >>> >>> Regards, >>> Volker >>> >>> >>> Vladimir Kozlov >> > schrieb am Fr. 29. Sep. 2017 um 15:47: >>> >>> I hit build failure when tried to push changes: >>> >>> src/hotspot/share/code/codeBlob.hpp(162) : warning C4267: '=' : >>> conversion from 'size_t' to 'int', possible loss of data >>> src/hotspot/share/code/codeBlob.hpp(163) : warning C4267: '=' : >>> conversion from 'size_t' to 'int', possible loss of data >>> >>> I am going to fix it by casting (int): >>> >>> + void adjust_size(size_t used) { >>> + _size = (int)used; >>> + _data_offset = (int)used; >>> + _code_end = (address)this + used; >>> + _data_end = (address)this + used; >>> + } >>> >>> Note, CodeCache size can't more than 2Gb (max_int) so such casting is >>> fine. >>> >>> Vladimir >>> >>> On 9/6/17 6:20 AM, Volker Simonis wrote: >>> > On Tue, Sep 5, 2017 at 9:36 PM, >> > wrote: >>> >> >>> >> I was going to make the same comment about the friend declaration >>> in v1, so >>> >> v2 looks better to me. Looks good. Thank you for finding a >>> solution to >>> >> this problem that we've had for a long time. I will sponsor this >>> (remind me >>> >> if I forget after the 18th). >>> >> >>> > >>> > Thanks Coleen! 
I've updated >>> > >>> > http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/ >>> >>> > >>> > in-place and added you as a second reviewer. >>> > >>> > Regards, >>> > Volker >>> > >>> > >>> >> thanks, >>> >> Coleen >>> >> >>> >> >>> >> >>> >> On 9/5/17 1:17 PM, Vladimir Kozlov wrote: >>> >>> >>> >>> On 9/5/17 9:49 AM, Volker Simonis wrote: >>> >>>> >>> >>>> On Fri, Sep 1, 2017 at 6:16 PM, Vladimir Kozlov >>> >>>> > >>> wrote: >>> >>>>> >>> >>>>> May be add new CodeBlob's method to adjust sizes instead of >>> directly >>> >>>>> setting >>> >>>>> them in CodeCache::free_unused_tail(). Then you would not need >>> friend >>> >>>>> class >>> >>>>> CodeCache in CodeBlob. >>> >>>>> >>> >>>> >>> >>>> Changed as suggested (I didn't liked the friend declaration as >>> well :) >>> >>>> >>> >>>>> Also I think adjustment to header_size should be done in >>> >>>>> CodeCache::free_unused_tail() to limit scope of code who knows >>> about >>> >>>>> blob >>> >>>>> layout. >>> >>>>> >>> >>>> >>> >>>> Yes, that's much cleaner. Please find the updated webrev here: >>> >>>> >>> >>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/ >>> >>> >>> >>> >>> >>> >>> >>> Good. >>> >>> >>> >>>> >>> >>>> I've also found another "day 1" problem in StubQueue::next(): >>> >>>> >>> >>>> Stub* next(Stub* s) const { int i = >>> >>>> index_of(s) + stub_size(s); >>> >>>> - if (i == >>> >>>> _buffer_limit) i = 0; >>> >>>> + // Only wrap >>> >>>> around in the non-contiguous case (see stubss.cpp) >>> >>>> + if (i == >>> >>>> _buffer_limit && _queue_end < _buffer_limit) i = 0; >>> >>>> return (i == >>> >>>> _queue_end) ? NULL : stub_at(i); >>> >>>> } >>> >>>> >>> >>>> The problem was that the method was not prepared to handle the >>> case >>> >>>> where _buffer_limit == _queue_end == _buffer_size which lead to >>> an >>> >>>> infinite recursion when iterating over a StubQueue with >>> >>>> StubQueue::next() until next() returns NULL (as this was for >>> example >>> >>>> done with -XX:+PrintInterpreter). But with the new, trimmed >>> CodeBlob >>> >>>> we run into exactly this situation. >>> >>> >>> >>> >>> >>> Okay. >>> >>> >>> >>>> >>> >>>> While doing this last fix I also noticed that >>> "StubQueue::stubs_do()", >>> >>>> "StubQueue::queues_do()" and "StubQueue::register_queue()" don't >>> seem >>> >>>> to be used anywhere in the open code base (please correct me if >>> I'm >>> >>>> wrong). What do you think, maybe we should remove this code in a >>> >>>> follow up change if it is really not needed? >>> >>> >>> >>> >>> >>> register_queue() is used in constructor. Other 2 you can remove. >>> >>> stub_code_begin() and stub_code_end() are not used too -remove. >>> >>> I thought we run on linux with flag which warn about unused code. >>> >>> >>> >>>> >>> >>>> Finally, could you please run the new version through JPRT and >>> sponsor >>> >>>> it once jdk10/hs will be opened again? >>> >>> >>> >>> >>> >>> Will do when jdk10 "consolidation" is finished. Please, remind me >>> later if >>> >>> I forget. 
>>> >>> >>> >>> Thanks, >>> >>> Vladimir >>> >>> >>> >>>> >>> >>>> Thanks, >>> >>>> Volker >>> >>>> >>> >>>>> Thanks, >>> >>>>> Vladimir >>> >>>>> >>> >>>>> >>> >>>>> On 9/1/17 8:46 AM, Volker Simonis wrote: >>> >>>>>> >>> >>>>>> >>> >>>>>> Hi, >>> >>>>>> >>> >>>>>> I've decided to split the fix for the >>> 'CodeHeap::contains_blob()' >>> >>>>>> problem into its own issue "8187091: ReturnBlobToWrongHeapTest >>> fails >>> >>>>>> because of problems in CodeHeap::contains_blob()" >>> >>>>>> (https://bugs.openjdk.java.net/browse/JDK-8187091) and started >>> a new >>> >>>>>> review thread for discussing it at: >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028206.html >>> >>>>>> >>> >>>>>> So please lets keep this thread for discussing the interpreter >>> code >>> >>>>>> size issue only. I've prepared a new version of the webrev >>> which is >>> >>>>>> the same as the first one with the only difference that the >>> change to >>> >>>>>> 'CodeHeap::contains_blob()' has been removed: >>> >>>>>> >>> >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v1/ >>> >>> >>>>>> >>> >>>>>> Thanks, >>> >>>>>> Volker >>> >>>>>> >>> >>>>>> >>> >>>>>> On Thu, Aug 31, 2017 at 6:35 PM, Volker Simonis >>> >>>>>> > >>> wrote: >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> On Thu, Aug 31, 2017 at 6:05 PM, Vladimir Kozlov >>> >>>>>>> >> > wrote: >>> >>>>>>>> >>> >>>>>>>> >>> >>>>>>>> Very good change. Thank you, Volker. >>> >>>>>>>> >>> >>>>>>>> About contains_blob(). The problem is that AOTCompiledMethod >>> >>>>>>>> allocated >>> >>>>>>>> in >>> >>>>>>>> CHeap and not in aot code section (which is RO): >>> >>>>>>>> >>> >>>>>>>> >>> >>>>>>>> >>> >>>>>>>> >>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/8acd232fb52a/src/share/vm/aot/aotCompiledMethod.hpp#l124 >>> >>>>>>>> >>> >>>>>>>> It is allocated in CHeap after AOT library is loaded. Its >>> >>>>>>>> code_begin() >>> >>>>>>>> points to AOT code section but AOTCompiledMethod* points >>> outside it >>> >>>>>>>> (to >>> >>>>>>>> normal malloced space) so you can't use (char*)blob address. >>> >>>>>>>> >>> >>>>>>> >>> >>>>>>> Thanks for the explanation - now I got it. >>> >>>>>>> >>> >>>>>>>> There are 2 ways to fix it, I think. >>> >>>>>>>> One is to add new field to CodeBlobLayout and set it to >>> blob* address >>> >>>>>>>> for >>> >>>>>>>> normal CodeCache blobs and to code_begin for AOT code. >>> >>>>>>>> Second is to use contains(blob->code_end() - 1) assuming >>> that AOT >>> >>>>>>>> code >>> >>>>>>>> is >>> >>>>>>>> never zero. >>> >>>>>>>> >>> >>>>>>> >>> >>>>>>> I'll give it a try tomorrow and will send out a new webrev. >>> >>>>>>> >>> >>>>>>> Regards, >>> >>>>>>> Volker >>> >>>>>>> >>> >>>>>>>> Thanks, >>> >>>>>>>> Vladimir >>> >>>>>>>> >>> >>>>>>>> >>> >>>>>>>> On 8/31/17 5:43 AM, Volker Simonis wrote: >>> >>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> On Thu, Aug 31, 2017 at 12:14 PM, Claes Redestad >>> >>>>>>>>> >> > wrote: >>> >>>>>>>>>> >>> >>>>>>>>>> >>> >>>>>>>>>> >>> >>>>>>>>>> >>> >>>>>>>>>> >>> >>>>>>>>>> On 2017-08-31 08:54, Volker Simonis wrote: >>> >>>>>>>>>>> >>> >>>>>>>>>>> >>> >>>>>>>>>>> >>> >>>>>>>>>>> >>> >>>>>>>>>>> While working on this, I found another problem which is >>> related to >>> >>>>>>>>>>> the >>> >>>>>>>>>>> fix of JDK-8183573 and leads to crashes when executing >>> the JTreg >>> >>>>>>>>>>> test >>> >>>>>>>>>>> compiler/codecache/stress/ReturnBlobToWrongHeapTest.java. 
>>> >>>>>>>>>>> >>> >>>>>>>>>>> The problem is that JDK-8183573 replaced >>> >>>>>>>>>>> >>> >>>>>>>>>>> virtual bool contains_blob(const CodeBlob* blob) >>> const { >>> >>>>>>>>>>> return >>> >>>>>>>>>>> low_boundary() <= (char*) blob && (char*) blob < high(); >>> } >>> >>>>>>>>>>> >>> >>>>>>>>>>> by: >>> >>>>>>>>>>> >>> >>>>>>>>>>> bool contains_blob(const CodeBlob* blob) const { >>> return >>> >>>>>>>>>>> contains(blob->code_begin()); } >>> >>>>>>>>>>> >>> >>>>>>>>>>> But that my be wrong in the corner case where the size of >>> the >>> >>>>>>>>>>> CodeBlob's payload is zero (i.e. the CodeBlob consists >>> only of the >>> >>>>>>>>>>> 'header' - i.e. the C++ object itself) because in that >>> case >>> >>>>>>>>>>> CodeBlob::code_begin() points right behind the CodeBlob's >>> header >>> >>>>>>>>>>> which >>> >>>>>>>>>>> is a memory location which doesn't belong to the CodeBlob >>> anymore. >>> >>>>>>>>>> >>> >>>>>>>>>> >>> >>>>>>>>>> >>> >>>>>>>>>> >>> >>>>>>>>>> >>> >>>>>>>>>> I recall this change was somehow necessary to allow >>> merging >>> >>>>>>>>>> AOTCodeHeap::contains_blob and CodeHead::contains_blob >>> into >>> >>>>>>>>>> one devirtualized method, so you need to ensure all AOT >>> tests >>> >>>>>>>>>> pass with this change (on linux-x64). >>> >>>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> All of hotspot/test/aot and hotspot/test/jvmci executed and >>> passed >>> >>>>>>>>> successful. Are there any other tests I should check? >>> >>>>>>>>> >>> >>>>>>>>> That said, it is a little hard to follow the stages of your >>> change. >>> >>>>>>>>> It >>> >>>>>>>>> seems like >>> >>>>>>>>> >>> http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.00/ >>> >>> >>>>>>>>> was reviewed [1] but then finally the slightly changed >>> version from >>> >>>>>>>>> >>> http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.01/ >>> >>> >>> >>>>>>>>> was >>> >>>>>>>>> checked in and linked to the bug report. >>> >>>>>>>>> >>> >>>>>>>>> The first, reviewed version of the change still had a >>> correct >>> >>>>>>>>> version >>> >>>>>>>>> of 'CodeHeap::contains_blob(const CodeBlob* blob)' while >>> the second, >>> >>>>>>>>> checked in version has the faulty version of that method. >>> >>>>>>>>> >>> >>>>>>>>> I don't know why you finally did that change to >>> 'contains_blob()' >>> >>>>>>>>> but >>> >>>>>>>>> I don't see any reason why we shouldn't be able to directly >>> use the >>> >>>>>>>>> blob's address for inclusion checking. From what I >>> understand, it >>> >>>>>>>>> should ALWAYS be contained in the corresponding CodeHeap so >>> no >>> >>>>>>>>> reason >>> >>>>>>>>> to mess with 'CodeBlob::code_begin()'. >>> >>>>>>>>> >>> >>>>>>>>> Please let me know if I'm missing something. >>> >>>>>>>>> >>> >>>>>>>>> [1] >>> >>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-July/026624.html >>> >>>>>>>>> >>> >>>>>>>>>> I can't help to wonder if we'd not be better served by >>> disallowing >>> >>>>>>>>>> zero-sized payloads. Is this something that can ever >>> actually >>> >>>>>>>>>> happen except by abuse of the white box API? >>> >>>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> The corresponding test (ReturnBlobToWrongHeapTest.java) >>> specifically >>> >>>>>>>>> wants to allocate "segment sized" blocks which is most >>> easily >>> >>>>>>>>> achieved >>> >>>>>>>>> by allocation zero-sized CodeBlobs. And I think there's >>> nothing >>> >>>>>>>>> wrong >>> >>>>>>>>> about it if we handle the inclusion tests correctly. 
>>> >>>>>>>>> >>> >>>>>>>>> Thank you and best regards, >>> >>>>>>>>> Volker >>> >>>>>>>>> >>> >>>>>>>>>> /Claes >>> >> >>> >> >>> >> > From cthalinger at twitter.com Mon Oct 9 19:45:58 2017 From: cthalinger at twitter.com (Christian Thalinger) Date: Mon, 9 Oct 2017 09:45:58 -1000 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: References: <50cda0ab-f403-372a-ce51-1a27d8821448@oracle.com> <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> Message-ID: <50CACA26-DF35-428F-8385-AB4CE74FFD6E@twitter.com> > On Oct 9, 2017, at 9:24 AM, Volker Simonis wrote: > > Hi Vladimir, > > I've analyzed the crash. The problem is Sparc specific because on > Sparc we do not call the SharedRuntime for G1 pre/post barriers (i.e. > SharedRuntime::g1_wb_pre() / SharedRuntime::g1_wb_post()) like on > other architectures. Instead we lazily create assembler stubs on the > fly (generate_satb_log_enqueue_if_necessary() / > generate_dirty_card_log_enqueue_if_necessary()) when they are needed. Why are we using these stubs on SPARC? Can we get rid of them and just call into the runtime instead? > This happens during the generation of the interpreter and allocates > more memory in the code cache such that we can't shrink the memory > which was initially allocated for the interpreter any more. > > Unfortunately we can't easily generate these stubs during > 'stubRoutines_init1()' because > 'generate_dirty_card_log_enqueue_if_necessary()' needs the byte map > base address which is only initialized in > 'CardTableModRefBS::initialize()' during 'univers_init()' which > happens after 'stubRoutines_init1()'. > > I'm still thinking about a good way to fix this without too many > platfrom-specific ifdefs. > > Regards, > Volker > > > On Tue, Oct 3, 2017 at 9:46 PM, Vladimir Kozlov > wrote: >> I rebased it. But there is problem with changes. VM hit guarantee() in this >> code when run on SPARC in both, fastdebug and product, builds. >> Crash happens during build. We can't push this - problem should be >> investigated and fixed first. >> >> Thanks, >> Vladimir >> >> make/Main.gmk:443: recipe for target 'generate-link-opt-data' failed >> /usr/ccs/bin/bash: line 4: 9349 Abort (core dumped) >> /s/build/solaris-sparcv9-debug/support/interim-image/bin/java >> -XX:DumpLoadedClassList=/s/build/solaris-sparcv9-debug/support/link_opt/classlist >> -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true -cp >> /s/build/solaris-sparcv9-debug/support/classlist.jar >> build.tools.classlist.HelloClasslist 2>&1 > >> /s/build/solaris-sparcv9-debug/support/link_opt/default_jli_trace.txt >> make[3]: *** [/s/build/solaris-sparcv9-debug/support/link_opt/classlist] >> Error 134 >> make[2]: *** [generate-link-opt-data] Error 1 >> >> >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/s/open/src/hotspot/share/memory/heap.cpp:233), pid=9349, >> tid=2 >> # guarantee(b == block_at(_next_segment - actual_number_of_segments)) >> failed: Intermediate allocation! >> # >> # JRE version: (10.0) (fastdebug build ) >> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >> 10-internal+0-2017-09-30-014154.8166317, mixed mode, tiered, compressed >> oops, g1 gc, solaris-sparc) >> # Core dump will be written. 
Default location: /s/open/make/core or >> core.9349 >> # >> # If you would like to submit a bug report, please visit: >> # http://bugreport.java.com/bugreport/crash.jsp >> # >> >> --------------- S U M M A R Y ------------ >> >> Command Line: >> -XX:DumpLoadedClassList=/s/build/solaris-sparcv9-debug/support/link_opt/classlist >> -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true >> build.tools.classlist.HelloClasslist >> >> Host: sca00dbv, Sparcv9 64 bit 3600 MHz, 16 cores, 32G, Oracle Solaris 11.2 >> SPARC >> Time: Sat Sep 30 03:29:46 2017 UTC elapsed time: 0 seconds (0d 0h 0m 0s) >> >> --------------- T H R E A D --------------- >> >> Current thread (0x000000010012f000): JavaThread "Unknown thread" >> [_thread_in_vm, id=2, stack(0x0007fffef9700000,0x0007fffef9800000)] >> >> Stack: [0x0007fffef9700000,0x0007fffef9800000], sp=0x0007fffef97ff020, >> free space=1020k >> Native frames: (J=compiled Java code, A=aot compiled Java code, >> j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x1f94508] void VMError::report_and_die(int,const char*,const >> char*,void*,Thread*,unsigned char*,void*,void*,const char*,int,unsigned >> long)+0xa58 >> V [libjvm.so+0x1f93a3c] void VMError::report_and_die(Thread*,const >> char*,int,const char*,const char*,void*)+0x3c >> V [libjvm.so+0xd02f38] void report_vm_error(const char*,int,const >> char*,const char*,...)+0x78 >> V [libjvm.so+0xfc219c] void CodeHeap::deallocate_tail(void*,unsigned >> long)+0xec >> V [libjvm.so+0xbf4f14] void CodeCache::free_unused_tail(CodeBlob*,unsigned >> long)+0xe4 >> V [libjvm.so+0x1e0ae70] void StubQueue::deallocate_unused_tail()+0x40 >> V [libjvm.so+0x1e7452c] void TemplateInterpreter::initialize()+0x19c >> V [libjvm.so+0x1051220] void interpreter_init()+0x20 >> V [libjvm.so+0x10116e0] int init_globals()+0xf0 >> V [libjvm.so+0x1ed8548] int >> Threads::create_vm(JavaVMInitArgs*,bool*)+0x4a8 >> V [libjvm.so+0x11c7b58] int >> JNI_CreateJavaVM_inner(JavaVM_**,void**,void*)+0x108 >> C [libjli.so+0x7950] InitializeJVM+0x100 >> >> >> On 10/2/17 7:55 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> I can sponsor this for you once you rebase, and fix these compilation >>> errors. >>> Thanks, >>> Coleen >>> >>> On 9/30/17 12:28 AM, Volker Simonis wrote: >>>> >>>> Hi Vladimir, >>>> >>>> thanks a lot for remembering these changes! >>>> >>>> Regards, >>>> Volker >>>> >>>> >>>> Vladimir Kozlov >>> > schrieb am Fr. 29. Sep. 2017 um 15:47: >>>> >>>> I hit build failure when tried to push changes: >>>> >>>> src/hotspot/share/code/codeBlob.hpp(162) : warning C4267: '=' : >>>> conversion from 'size_t' to 'int', possible loss of data >>>> src/hotspot/share/code/codeBlob.hpp(163) : warning C4267: '=' : >>>> conversion from 'size_t' to 'int', possible loss of data >>>> >>>> I am going to fix it by casting (int): >>>> >>>> + void adjust_size(size_t used) { >>>> + _size = (int)used; >>>> + _data_offset = (int)used; >>>> + _code_end = (address)this + used; >>>> + _data_end = (address)this + used; >>>> + } >>>> >>>> Note, CodeCache size can't more than 2Gb (max_int) so such casting is >>>> fine. >>>> >>>> Vladimir >>>> >>>> On 9/6/17 6:20 AM, Volker Simonis wrote: >>>>> On Tue, Sep 5, 2017 at 9:36 PM, >>> > wrote: >>>>>> >>>>>> I was going to make the same comment about the friend declaration >>>> in v1, so >>>>>> v2 looks better to me. Looks good. Thank you for finding a >>>> solution to >>>>>> this problem that we've had for a long time. I will sponsor this >>>> (remind me >>>>>> if I forget after the 18th). 
>>>>>> >>>>> >>>>> Thanks Coleen! I've updated >>>>> >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/ >>>> >>>>> >>>>> in-place and added you as a second reviewer. >>>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> >>>>>> thanks, >>>>>> Coleen >>>>>> >>>>>> >>>>>> >>>>>> On 9/5/17 1:17 PM, Vladimir Kozlov wrote: >>>>>>> >>>>>>> On 9/5/17 9:49 AM, Volker Simonis wrote: >>>>>>>> >>>>>>>> On Fri, Sep 1, 2017 at 6:16 PM, Vladimir Kozlov >>>>>>>> > >>>> wrote: >>>>>>>>> >>>>>>>>> May be add new CodeBlob's method to adjust sizes instead of >>>> directly >>>>>>>>> setting >>>>>>>>> them in CodeCache::free_unused_tail(). Then you would not need >>>> friend >>>>>>>>> class >>>>>>>>> CodeCache in CodeBlob. >>>>>>>>> >>>>>>>> >>>>>>>> Changed as suggested (I didn't liked the friend declaration as >>>> well :) >>>>>>>> >>>>>>>>> Also I think adjustment to header_size should be done in >>>>>>>>> CodeCache::free_unused_tail() to limit scope of code who knows >>>> about >>>>>>>>> blob >>>>>>>>> layout. >>>>>>>>> >>>>>>>> >>>>>>>> Yes, that's much cleaner. Please find the updated webrev here: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v2/ >>>> >>>> >>>>>>> >>>>>>> >>>>>>> Good. >>>>>>> >>>>>>>> >>>>>>>> I've also found another "day 1" problem in StubQueue::next(): >>>>>>>> >>>>>>>> Stub* next(Stub* s) const { int i = >>>>>>>> index_of(s) + stub_size(s); >>>>>>>> - if (i == >>>>>>>> _buffer_limit) i = 0; >>>>>>>> + // Only wrap >>>>>>>> around in the non-contiguous case (see stubss.cpp) >>>>>>>> + if (i == >>>>>>>> _buffer_limit && _queue_end < _buffer_limit) i = 0; >>>>>>>> return (i == >>>>>>>> _queue_end) ? NULL : stub_at(i); >>>>>>>> } >>>>>>>> >>>>>>>> The problem was that the method was not prepared to handle the >>>> case >>>>>>>> where _buffer_limit == _queue_end == _buffer_size which lead to >>>> an >>>>>>>> infinite recursion when iterating over a StubQueue with >>>>>>>> StubQueue::next() until next() returns NULL (as this was for >>>> example >>>>>>>> done with -XX:+PrintInterpreter). But with the new, trimmed >>>> CodeBlob >>>>>>>> we run into exactly this situation. >>>>>>> >>>>>>> >>>>>>> Okay. >>>>>>> >>>>>>>> >>>>>>>> While doing this last fix I also noticed that >>>> "StubQueue::stubs_do()", >>>>>>>> "StubQueue::queues_do()" and "StubQueue::register_queue()" don't >>>> seem >>>>>>>> to be used anywhere in the open code base (please correct me if >>>> I'm >>>>>>>> wrong). What do you think, maybe we should remove this code in a >>>>>>>> follow up change if it is really not needed? >>>>>>> >>>>>>> >>>>>>> register_queue() is used in constructor. Other 2 you can remove. >>>>>>> stub_code_begin() and stub_code_end() are not used too -remove. >>>>>>> I thought we run on linux with flag which warn about unused code. >>>>>>> >>>>>>>> >>>>>>>> Finally, could you please run the new version through JPRT and >>>> sponsor >>>>>>>> it once jdk10/hs will be opened again? >>>>>>> >>>>>>> >>>>>>> Will do when jdk10 "consolidation" is finished. Please, remind me >>>> later if >>>>>>> I forget. 
>>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Volker >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Vladimir >>>>>>>>> >>>>>>>>> >>>>>>>>> On 9/1/17 8:46 AM, Volker Simonis wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I've decided to split the fix for the >>>> 'CodeHeap::contains_blob()' >>>>>>>>>> problem into its own issue "8187091: ReturnBlobToWrongHeapTest >>>> fails >>>>>>>>>> because of problems in CodeHeap::contains_blob()" >>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-8187091) and started >>>> a new >>>>>>>>>> review thread for discussing it at: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-September/028206.html >>>>>>>>>> >>>>>>>>>> So please lets keep this thread for discussing the interpreter >>>> code >>>>>>>>>> size issue only. I've prepared a new version of the webrev >>>> which is >>>>>>>>>> the same as the first one with the only difference that the >>>> change to >>>>>>>>>> 'CodeHeap::contains_blob()' has been removed: >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v1/ >>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Volker >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Aug 31, 2017 at 6:35 PM, Volker Simonis >>>>>>>>>> > >>>> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Aug 31, 2017 at 6:05 PM, Vladimir Kozlov >>>>>>>>>>> >>> > wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Very good change. Thank you, Volker. >>>>>>>>>>>> >>>>>>>>>>>> About contains_blob(). The problem is that AOTCompiledMethod >>>>>>>>>>>> allocated >>>>>>>>>>>> in >>>>>>>>>>>> CHeap and not in aot code section (which is RO): >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/8acd232fb52a/src/share/vm/aot/aotCompiledMethod.hpp#l124 >>>>>>>>>>>> >>>>>>>>>>>> It is allocated in CHeap after AOT library is loaded. Its >>>>>>>>>>>> code_begin() >>>>>>>>>>>> points to AOT code section but AOTCompiledMethod* points >>>> outside it >>>>>>>>>>>> (to >>>>>>>>>>>> normal malloced space) so you can't use (char*)blob address. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks for the explanation - now I got it. >>>>>>>>>>> >>>>>>>>>>>> There are 2 ways to fix it, I think. >>>>>>>>>>>> One is to add new field to CodeBlobLayout and set it to >>>> blob* address >>>>>>>>>>>> for >>>>>>>>>>>> normal CodeCache blobs and to code_begin for AOT code. >>>>>>>>>>>> Second is to use contains(blob->code_end() - 1) assuming >>>> that AOT >>>>>>>>>>>> code >>>>>>>>>>>> is >>>>>>>>>>>> never zero. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'll give it a try tomorrow and will send out a new webrev. >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> Volker >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Vladimir >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 8/31/17 5:43 AM, Volker Simonis wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Aug 31, 2017 at 12:14 PM, Claes Redestad >>>>>>>>>>>>> >>> > wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2017-08-31 08:54, Volker Simonis wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> While working on this, I found another problem which is >>>> related to >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> fix of JDK-8183573 and leads to crashes when executing >>>> the JTreg >>>>>>>>>>>>>>> test >>>>>>>>>>>>>>> compiler/codecache/stress/ReturnBlobToWrongHeapTest.java. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The problem is that JDK-8183573 replaced >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> virtual bool contains_blob(const CodeBlob* blob) >>>> const { >>>>>>>>>>>>>>> return >>>>>>>>>>>>>>> low_boundary() <= (char*) blob && (char*) blob < high(); >>>> } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> by: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> bool contains_blob(const CodeBlob* blob) const { >>>> return >>>>>>>>>>>>>>> contains(blob->code_begin()); } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> But that my be wrong in the corner case where the size of >>>> the >>>>>>>>>>>>>>> CodeBlob's payload is zero (i.e. the CodeBlob consists >>>> only of the >>>>>>>>>>>>>>> 'header' - i.e. the C++ object itself) because in that >>>> case >>>>>>>>>>>>>>> CodeBlob::code_begin() points right behind the CodeBlob's >>>> header >>>>>>>>>>>>>>> which >>>>>>>>>>>>>>> is a memory location which doesn't belong to the CodeBlob >>>> anymore. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I recall this change was somehow necessary to allow >>>> merging >>>>>>>>>>>>>> AOTCodeHeap::contains_blob and CodeHead::contains_blob >>>> into >>>>>>>>>>>>>> one devirtualized method, so you need to ensure all AOT >>>> tests >>>>>>>>>>>>>> pass with this change (on linux-x64). >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> All of hotspot/test/aot and hotspot/test/jvmci executed and >>>> passed >>>>>>>>>>>>> successful. Are there any other tests I should check? >>>>>>>>>>>>> >>>>>>>>>>>>> That said, it is a little hard to follow the stages of your >>>> change. >>>>>>>>>>>>> It >>>>>>>>>>>>> seems like >>>>>>>>>>>>> >>>> http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.00/ >>>> >>>>>>>>>>>>> was reviewed [1] but then finally the slightly changed >>>> version from >>>>>>>>>>>>> >>>> http://cr.openjdk.java.net/~redestad/scratch/codeheap_contains.01/ >>>> >>>> >>>>>>>>>>>>> was >>>>>>>>>>>>> checked in and linked to the bug report. >>>>>>>>>>>>> >>>>>>>>>>>>> The first, reviewed version of the change still had a >>>> correct >>>>>>>>>>>>> version >>>>>>>>>>>>> of 'CodeHeap::contains_blob(const CodeBlob* blob)' while >>>> the second, >>>>>>>>>>>>> checked in version has the faulty version of that method. >>>>>>>>>>>>> >>>>>>>>>>>>> I don't know why you finally did that change to >>>> 'contains_blob()' >>>>>>>>>>>>> but >>>>>>>>>>>>> I don't see any reason why we shouldn't be able to directly >>>> use the >>>>>>>>>>>>> blob's address for inclusion checking. From what I >>>> understand, it >>>>>>>>>>>>> should ALWAYS be contained in the corresponding CodeHeap so >>>> no >>>>>>>>>>>>> reason >>>>>>>>>>>>> to mess with 'CodeBlob::code_begin()'. >>>>>>>>>>>>> >>>>>>>>>>>>> Please let me know if I'm missing something. >>>>>>>>>>>>> >>>>>>>>>>>>> [1] >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-July/026624.html >>>>>>>>>>>>> >>>>>>>>>>>>>> I can't help to wonder if we'd not be better served by >>>> disallowing >>>>>>>>>>>>>> zero-sized payloads. Is this something that can ever >>>> actually >>>>>>>>>>>>>> happen except by abuse of the white box API? >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> The corresponding test (ReturnBlobToWrongHeapTest.java) >>>> specifically >>>>>>>>>>>>> wants to allocate "segment sized" blocks which is most >>>> easily >>>>>>>>>>>>> achieved >>>>>>>>>>>>> by allocation zero-sized CodeBlobs. And I think there's >>>> nothing >>>>>>>>>>>>> wrong >>>>>>>>>>>>> about it if we handle the inclusion tests correctly. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Thank you and best regards, >>>>>>>>>>>>> Volker >>>>>>>>>>>>> >>>>>>>>>>>>>> /Claes >>>>>> >>>>>> >>>> >>> >> From jesper.wilhelmsson at oracle.com Tue Oct 10 01:35:21 2017 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Tue, 10 Oct 2017 03:35:21 +0200 Subject: RFR (xs): JDK-8189071 - Require jtreg 4.2 b09 Message-ID: <231091EC-AF95-4C88-A5CC-F555FC2C9CC1@oracle.com> Hi, Can I have a review of this trivial fix to update the jib-profile to require the latest version of jtreg. This to get rid of the SocketTimeoutException that we see in the hotspot nightlies. Bug: https://bugs.openjdk.java.net/browse/JDK-8189071 The change is: diff --git a/make/conf/jib-profiles.js b/make/conf/jib-profiles.js --- a/make/conf/jib-profiles.js +++ b/make/conf/jib-profiles.js @@ -1063,7 +1063,7 @@ jtreg: { server: "javare", revision: "4.2", - build_number: "b08", + build_number: "b09", checksum_file: "MD5_VALUES", file: "jtreg_bin-4.2.zip", environment_name: "JT_HOME", Thanks, /Jesper From david.holmes at oracle.com Tue Oct 10 01:45:40 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Oct 2017 11:45:40 +1000 Subject: RFR (xs): JDK-8189071 - Require jtreg 4.2 b09 In-Reply-To: <231091EC-AF95-4C88-A5CC-F555FC2C9CC1@oracle.com> References: <231091EC-AF95-4C88-A5CC-F555FC2C9CC1@oracle.com> Message-ID: <0c3c434f-8a4e-1c91-d21d-62028382c8d6@oracle.com> Reviewed! Push it under trivial rules. Thanks, David On 10/10/2017 11:35 AM, jesper.wilhelmsson at oracle.com wrote: > Hi, > > Can I have a review of this trivial fix to update the jib-profile to require the latest version of jtreg. This to get rid of the SocketTimeoutException that we see in the hotspot nightlies. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8189071 > > The change is: > > diff --git a/make/conf/jib-profiles.js b/make/conf/jib-profiles.js > --- a/make/conf/jib-profiles.js > +++ b/make/conf/jib-profiles.js > @@ -1063,7 +1063,7 @@ > jtreg: { > server: "javare", > revision: "4.2", > - build_number: "b08", > + build_number: "b09", > checksum_file: "MD5_VALUES", > file: "jtreg_bin-4.2.zip", > environment_name: "JT_HOME", > > > Thanks, > /Jesper > From george.triantafillou at oracle.com Tue Oct 10 01:50:45 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Mon, 9 Oct 2017 21:50:45 -0400 Subject: RFR (xs): JDK-8189071 - Require jtreg 4.2 b09 In-Reply-To: <231091EC-AF95-4C88-A5CC-F555FC2C9CC1@oracle.com> References: <231091EC-AF95-4C88-A5CC-F555FC2C9CC1@oracle.com> Message-ID: Hi Jesper, Looks good. -George On 10/9/2017 9:35 PM, jesper.wilhelmsson at oracle.com wrote: > Hi, > > Can I have a review of this trivial fix to update the jib-profile to require the latest version of jtreg. This to get rid of the SocketTimeoutException that we see in the hotspot nightlies. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8189071 > > The change is: > > diff --git a/make/conf/jib-profiles.js b/make/conf/jib-profiles.js > --- a/make/conf/jib-profiles.js > +++ b/make/conf/jib-profiles.js > @@ -1063,7 +1063,7 @@ > jtreg: { > server: "javare", > revision: "4.2", > - build_number: "b08", > + build_number: "b09", > checksum_file: "MD5_VALUES", > file: "jtreg_bin-4.2.zip", > environment_name: "JT_HOME", > > > Thanks, > /Jesper > From jesper.wilhelmsson at oracle.com Tue Oct 10 01:52:07 2017 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Tue, 10 Oct 2017 03:52:07 +0200 Subject: RFR (xs): JDK-8189071 - Require jtreg 4.2 b09 In-Reply-To: <0c3c434f-8a4e-1c91-d21d-62028382c8d6@oracle.com> References: <231091EC-AF95-4C88-A5CC-F555FC2C9CC1@oracle.com> <0c3c434f-8a4e-1c91-d21d-62028382c8d6@oracle.com> Message-ID: <1F417DB9-7275-4240-85CF-8F3AA2667E0D@oracle.com> Thanks David! /Jesper > On 10 Oct 2017, at 03:45, David Holmes wrote: > > Reviewed! > > Push it under trivial rules. > > Thanks, > David > > On 10/10/2017 11:35 AM, jesper.wilhelmsson at oracle.com wrote: >> Hi, >> Can I have a review of this trivial fix to update the jib-profile to require the latest version of jtreg. This to get rid of the SocketTimeoutException that we see in the hotspot nightlies. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8189071 >> The change is: >> diff --git a/make/conf/jib-profiles.js b/make/conf/jib-profiles.js >> --- a/make/conf/jib-profiles.js >> +++ b/make/conf/jib-profiles.js >> @@ -1063,7 +1063,7 @@ >> jtreg: { >> server: "javare", >> revision: "4.2", >> - build_number: "b08", >> + build_number: "b09", >> checksum_file: "MD5_VALUES", >> file: "jtreg_bin-4.2.zip", >> environment_name: "JT_HOME", >> Thanks, >> /Jesper From jesper.wilhelmsson at oracle.com Tue Oct 10 01:52:26 2017 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Tue, 10 Oct 2017 03:52:26 +0200 Subject: RFR (xs): JDK-8189071 - Require jtreg 4.2 b09 In-Reply-To: References: <231091EC-AF95-4C88-A5CC-F555FC2C9CC1@oracle.com> Message-ID: Thanks George! /Jesper > On 10 Oct 2017, at 03:50, George Triantafillou wrote: > > Hi Jesper, > > Looks good. > > -George > On 10/9/2017 9:35 PM, jesper.wilhelmsson at oracle.com wrote: >> Hi, >> >> Can I have a review of this trivial fix to update the jib-profile to require the latest version of jtreg. This to get rid of the SocketTimeoutException that we see in the hotspot nightlies. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8189071 >> >> The change is: >> >> diff --git a/make/conf/jib-profiles.js b/make/conf/jib-profiles.js >> --- a/make/conf/jib-profiles.js >> +++ b/make/conf/jib-profiles.js >> @@ -1063,7 +1063,7 @@ >> jtreg: { >> server: "javare", >> revision: "4.2", >> - build_number: "b08", >> + build_number: "b09", >> checksum_file: "MD5_VALUES", >> file: "jtreg_bin-4.2.zip", >> environment_name: "JT_HOME", >> >> >> Thanks, >> /Jesper >> > From aph at redhat.com Tue Oct 10 07:42:27 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 10 Oct 2017 08:42:27 +0100 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: References: <50cda0ab-f403-372a-ce51-1a27d8821448@oracle.com> <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> Message-ID: <4109f960-078f-e582-3c78-71f201a265fd@redhat.com> On 09/10/17 20:24, Volker Simonis wrote: > Unfortunately we can't easily generate these stubs during > 'stubRoutines_init1()' because > 'generate_dirty_card_log_enqueue_if_necessary()' needs the byte map > base address which is only initialized in > 'CardTableModRefBS::initialize()' during 'univers_init()' which > happens after 'stubRoutines_init1()'. Yes you can, you can do something like we do for narrow_ptrs_base: if (Universe::is_fully_initialized()) { mov(rheapbase, Universe::narrow_ptrs_base()); } else { lea(rheapbase, ExternalAddress((address)Universe::narrow_ptrs_base_addr())); ldr(rheapbase, Address(rheapbase)); } -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From kim.barrett at oracle.com Tue Oct 10 08:29:57 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 10 Oct 2017 04:29:57 -0400 Subject: RFR: 8189088: Add intrusive doubly-linked list utility Message-ID: RFR: 8189088: Add intrusive doubly-linked list utility Please review this new facility, providing a general mechanism for intrusive doubly-linked lists. A class supports inclusion in a list by having an IntrusiveListLink member, and providing structured information about how to access that member. A class supports inclusion in multiple lists by having multiple IntrusiveListLink members, with different lists specified to use different members. The IntrusiveList class template provides the list management. It is modelled on bidirectional containers such as std::list and boost::intrusive::list, providing many of the expected member types and functions. (Note that the member types use the Standard's naming conventions.) (Not all standard container requirements are met; some operations are not presently supported because they haven't been needed yet.) This includes iteration support using (mostly) standard-conforming iterator types (they are presently missing iterator_category member types, pending being able to include so we can use std::bidirectional_iterator_tag). This change only provides the new facility, and doesn't include any uses of it, though there is a suite of unit tests for it. I've extracted it from some in-progress work, as a useful tool in it's own right. I've converted a couple existing list implementations to use IntrusiveList, and will be submitting those changes once this infrastructure is in place. One place I haven't yet touched that I think will benefit is G1's region handling. 
There are various places where G1 iterates over all regions in order to do something with those which satisfy some property (humongous regions, regions in the collection set, &etc). If it were trivial to create new region sublists (and this facility makes that easy), some of these could be turned into direct iteration over only the regions of interest. CR: https://bugs.openjdk.java.net/browse/JDK-8189088 Webrev: http://cr.openjdk.java.net/~kbarrett/8189088 Testing: JPRT to build and run unit tests on supported platforms. From david.holmes at oracle.com Tue Oct 10 08:47:25 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Oct 2017 18:47:25 +1000 Subject: RFR: 8189088: Add intrusive doubly-linked list utility In-Reply-To: References: Message-ID: <12515708-d1c3-b284-1117-44b4561d53cd@oracle.com> Hi Kim, I get the gist of this but am not going to pretend I can follow all the details. :) So what actually gets expanded into the target type to support this. Is it just a next/prev pointer or is there additional infrastructure needed? Thanks, David On 10/10/2017 6:29 PM, Kim Barrett wrote: > RFR: 8189088: Add intrusive doubly-linked list utility > > Please review this new facility, providing a general mechanism for > intrusive doubly-linked lists. A class supports inclusion in a list by > having an IntrusiveListLink member, and providing structured > information about how to access that member. A class supports > inclusion in multiple lists by having multiple IntrusiveListLink > members, with different lists specified to use different members. > > The IntrusiveList class template provides the list management. It is > modelled on bidirectional containers such as std::list and > boost::intrusive::list, providing many of the expected member types > and functions. (Note that the member types use the Standard's naming > conventions.) (Not all standard container requirements are met; some > operations are not presently supported because they haven't been > needed yet.) This includes iteration support using (mostly) > standard-conforming iterator types (they are presently missing > iterator_category member types, pending being able to include > so we can use std::bidirectional_iterator_tag). > > This change only provides the new facility, and doesn't include any > uses of it, though there is a suite of unit tests for it. I've > extracted it from some in-progress work, as a useful tool in it's own > right. > > I've converted a couple existing list implementations to use > IntrusiveList, and will be submitting those changes once this > infrastructure is in place. One place I haven't yet touched that I > think will benefit is G1's region handling. There are various places > where G1 iterates over all regions in order to do something with those > which satisfy some property (humongous regions, regions in the > collection set, &etc). If it were trivial to create new region > sublists (and this facility makes that easy), some of these could be > turned into direct iteration over only the regions of interest. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8189088 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8189088 > > Testing: > JPRT to build and run unit tests on supported platforms. > > From glaubitz at physik.fu-berlin.de Tue Oct 10 09:32:32 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Tue, 10 Oct 2017 11:32:32 +0200 Subject: JDK10/RFR(M): 8172232: SPARC ISA/CPU feature detection is broken/insufficient (on Linux). 
In-Reply-To: <2d1fd501-8ba3-7591-a360-2cdc114cfbe9@physik.fu-berlin.de> References: <7d5e1ebb-7de8-66f1-a1f0-db465bcad4ab@oracle.com> <9f2896ca-65dc-557f-793c-4235499cc340@oracle.com> <3fcc865d-3eda-a341-e112-8417711ee3e5@oracle.com> <55211504-0f3e-52a0-0930-f34babb5da14@physik.fu-berlin.de> <2d1fd501-8ba3-7591-a360-2cdc114cfbe9@physik.fu-berlin.de> Message-ID: <276c6e05-1732-90da-466f-6c84326e7984@physik.fu-berlin.de> Hi Patric! On 10/04/2017 11:58 AM, John Paul Adrian Glaubitz wrote: > Hope this gets merged soon. After that, the linux-sparc builds > won't need any external patches downstream anymore. Any news on this? Thanks, Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. `' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From erik.joelsson at oracle.com Tue Oct 10 10:17:32 2017 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Tue, 10 Oct 2017 12:17:32 +0200 Subject: RFR (xs): JDK-8189071 - Require jtreg 4.2 b09 In-Reply-To: <231091EC-AF95-4C88-A5CC-F555FC2C9CC1@oracle.com> References: <231091EC-AF95-4C88-A5CC-F555FC2C9CC1@oracle.com> Message-ID: <8d0542fb-4594-7d5e-4ace-e1777d14de5b@oracle.com> Hello, This looks good, but in the future, please include build-dev on such changes. This one was trivial, but you never know. That way the build team is also better able to keep track of any build related changes. I only found out about this by stumbling over a conversation on internal chat. /Erik On 2017-10-10 03:35, jesper.wilhelmsson at oracle.com wrote: > Hi, > > Can I have a review of this trivial fix to update the jib-profile to require the latest version of jtreg. This to get rid of the SocketTimeoutException that we see in the hotspot nightlies. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8189071 > > The change is: > > diff --git a/make/conf/jib-profiles.js b/make/conf/jib-profiles.js > --- a/make/conf/jib-profiles.js > +++ b/make/conf/jib-profiles.js > @@ -1063,7 +1063,7 @@ > jtreg: { > server: "javare", > revision: "4.2", > - build_number: "b08", > + build_number: "b09", > checksum_file: "MD5_VALUES", > file: "jtreg_bin-4.2.zip", > environment_name: "JT_HOME", > > > Thanks, > /Jesper > From sean.mullan at oracle.com Tue Oct 10 12:26:12 2017 From: sean.mullan at oracle.com (Sean Mullan) Date: Tue, 10 Oct 2017 08:26:12 -0400 Subject: [10] RFR(S) 8188775: Module jdk.internal.vm.compiler.management has not been granted accessClassInPackage.org.graalvm.compiler.hotspot In-Reply-To: References: Message-ID: <449d883b-1208-9708-2da7-9cd6393a8db7@oracle.com> On 10/9/17 3:55 AM, Alan Bateman wrote: > On 05/10/2017 00:05, Vladimir Kozlov wrote: >> https://bugs.openjdk.java.net/browse/JDK-8188775 >> >> Changes for 8182701[1] missed changes in default.policy for new module >> jdk.internal.vm.compiler.management. >> >> Add missing code: >> >> src/java.base/share/lib/security/default.policy >> @@ -154,6 +154,10 @@ >> ???? permission java.security.AllPermission; >> ?}; >> >> +grant codeBase "jrt:/jdk.internal.vm.compiler.management" { >> +??? permission java.security.AllPermission; >> +}; >> + > This looks okay to me although it would be nice if we could identify the > minimal permissions rather than granting it AllPermission. +1. Is there any reason you did not just grant it RuntimePermission "accessClassInPackage.org.graalvm.compiler.hotspot"? 
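To make the suggestion concrete, the narrower grant Sean is describing would look something like the following default.policy entry, i.e. the same codeBase block as in the pushed fix but with the single runtime permission in place of AllPermission. This is only an illustrative sketch, not the reviewed change; whether that one permission is actually sufficient for the module is exactly the open question.

grant codeBase "jrt:/jdk.internal.vm.compiler.management" {
    permission java.lang.RuntimePermission
        "accessClassInPackage.org.graalvm.compiler.hotspot";
};
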
I see you have already pushed the fix, so I would recommend opening another issue to only grant the required permissions to the jdk.internal.vm.compiler.management module. Thanks, Sean From jbax at univocity.com Tue Oct 10 12:59:36 2017 From: jbax at univocity.com (Jeronimo Backes) Date: Tue, 10 Oct 2017 23:29:36 +1030 Subject: Issues with JDK 9 crashing itself and the operating system In-Reply-To: References: Message-ID: Hello Rohit Do you have any update regarding the cause of this? Looks like it is specific to Ryzen+Linux. On 25 September 2017 at 20:33, Rohit Arul Raj wrote: > Hello Jeronimo, > > Thanks for the detailed report. We were able to reproduce the issue on > our machine. > We will analyze this further and get back to you. > > Regards, > Rohit > > On Sat, Sep 23, 2017 at 4:46 PM, Jeronimo Backes > wrote: > > Hello, my name is Jeronimo and I'm the author of the univocity-parsers > > library (https://github.com/uniVocity/univocity-parsers) and I'm > writing to > > you by recommendation of Erik Duveblad. > > > > Basically, I recently installed the JDK 9 distributed by Oracle on my > > development computer and when I try to build my project (with a simple > `mvn > > clean install` command) the JVM crashes with: > > > > > > # A fatal error has been detected by the Java Runtime Environment: > > # > > # SIGSEGV (0xb) at pc=0x00007f18b96c52f0, pid=3865, tid=3904 > > # > > # JRE version: Java(TM) SE Runtime Environment (9.0+181) (build 9+181) > > # Java VM: Java HotSpot(TM) 64-Bit Server VM (9+181, mixed mode, tiered, > > compressed oops, g1 gc, linux-amd64) > > # Problematic frame: > > # V [libjvm.so+0x9292f0] > > JVMCIGlobals::check_jvmci_flags_are_consistent()+0x120 > > # > > # Core dump will be written. Default location: Core dumps may be > processed > > with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %e" (or > dumping to > > /home/jbax/dev/repository/univocity-parsers/core.3865) > > # > > # An error report file with more information is saved as: > > # /home/jbax/dev/repository/univocity-parsers/hs_err_pid3865.log > > # > > # Compiler replay data is saved as: > > # /home/jbax/dev/repository/univocity-parsers/replay_pid3865.log > > # > > # If you would like to submit a bug report, please visit: > > # http://bugreport.java.com/bugreport/crash.jsp > > # > > > > > > The hs_err files generated are available here > > https://github.com/uniVocity/univocity-parsers/files/ > 1326484/jdk_9_crash2.zip. > > This zip also contains the pom.xml file I used. The build succeeded 4 > times > > before the JVM crashed. > > > > Yesterday I had the crash happen 100% of the time, but the CPU was > > overclocked to 3.6Ghz (never had any issue with it though) and saved the > > error file here: > > https://github.com/uniVocity/univocity-parsers/files/ > 1324326/jdk_9_crash.zip. > > I created an issue on github to investigate this: > > https://github.com/uniVocity/univocity-parsers/issues/189. There Erik > > mentioned that: > > > > "Looking at the hs_err file, the stack trace is "wrong", a C2 Compiler > > Thread can't call JVMCIGlobals::check_jvmci_flags_are_consistent (and > the > > value of the register RIP does not correspond to any instruction in the > > compiled version of that function). This makes me suspect that something > > could be wrong with your CPU, the CPU should not have jumped to this > memory > > location." > > > > Things still fail with stock hardware settings. 
More details about my > > environment : > > > > OS, Maven and Java versions: > > > > [jbax at linux-pc ~]$ mvn -version > > Apache Maven 3.2.5 (12a6b3acb947671f09b81f49094c53f426d8cea1; > > 2014-12-15T03:59:23+10:30) > > Maven home: /home/jbax/dev/apache-maven > > Java version: 9, vendor: Oracle Corporation > > Java home: /home/jbax/dev/jdk9 > > Default locale: en_AU, platform encoding: UTF-8 > > OS name: "linux", version: "4.12.13-1-manjaro", arch: "amd64", family: > > "unix" > > [jbax at linux-pc ~]$ > > > > Hardware: > > [jbax at linux-pc univocity-parsers]$ lscpu > > Architecture: x86_64 > > CPU op-mode(s): 32-bit, 64-bit > > Byte Order: Little Endian > > CPU(s): 16 > > On-line CPU(s) list: 0-15 > > Thread(s) per core: 2 > > Core(s) per socket: 8 > > Socket(s): 1 > > NUMA node(s): 1 > > Vendor ID: AuthenticAMD > > CPU family: 23 > > Model: 1 > > Model name: AMD Ryzen 7 1700 Eight-Core Processor > > Stepping: 1 > > CPU MHz: 1550.000 > > CPU max MHz: 3000.0000 > > CPU min MHz: 1550.0000 > > BogoMIPS: 6001.43 > > Virtualization: AMD-V > > L1d cache: 32K > > L1i cache: 64K > > L2 cache: 512K > > L3 cache: 8192K > > NUMA node0 CPU(s): 0-15 > > Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge > > mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext > fxsr_opt > > pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid > extd_apicid > > aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe > popcnt > > aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm > > sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core > > perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 > smep > > bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves > clzero > > irperf arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid > > decodeassists pausefilter pfthreshold avic overflow_recov succor smca > > > > On an unrelated note, I use an old java application that crashes the > entire > > OS for me when Java 9 is used: http://www.jinchess.com/download > > > > It's just a matter of downloading, unpacking and trying to start it with > > jin-2.14.1/jin > > > > The OS crashes and I have to hard-reset the computer. It works just fine > if > > revert back to Java 6, 7 or 8. > > > > I thought you'd might want to investigate what is going on. Let me know > if > > you need more information. > > > > Best regards, > > > > Jeronimo. > > > > > > > > > > -- > > the uniVocity team > > www.univocity.com > -- the uniVocity team www.univocity.com From kim.barrett at oracle.com Tue Oct 10 14:56:08 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 10 Oct 2017 10:56:08 -0400 Subject: RFR: 8189088: Add intrusive doubly-linked list utility In-Reply-To: <12515708-d1c3-b284-1117-44b4561d53cd@oracle.com> References: <12515708-d1c3-b284-1117-44b4561d53cd@oracle.com> Message-ID: > On Oct 10, 2017, at 4:47 AM, David Holmes wrote: > > Hi Kim, > > I get the gist of this but am not going to pretend I can follow all the details. :) > > So what actually gets expanded into the target type to support this. Is it just a next/prev pointer or is there additional infrastructure needed? The target type gets a next/prev pair of pointers, plus a debug-only pointer to the currently containing list to support various assertions. Those are all packaged in the IntrusiveListLink class. Replicated for each simultaneous list an object may need to be in. 
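To picture what that answer means for a client class, here is a rough sketch. The class and member names below are made up, and the exact IntrusiveList template parameters are whatever the webrev defines (a pointer-to-member is just one plausible way to provide the "structured information about how to access that member" described in the RFR), so this only illustrates the shape of the facility, not its precise API:

  class Region {                          // hypothetical client class
    IntrusiveListLink _active_link;       // next/prev (plus debug-only owner) for one list
    IntrusiveListLink _humongous_link;    // a second, independent link for another list
    // ... payload fields ...
  };

  // Each list type names the embedded link member it manages, so the same
  // Region can sit on both lists at once, with no separate node allocation.
  typedef IntrusiveList<Region, &Region::_active_link>    ActiveRegionList;
  typedef IntrusiveList<Region, &Region::_humongous_link> HumongousRegionList;
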
From vladimir.kozlov at oracle.com Tue Oct 10 15:29:29 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 10 Oct 2017 08:29:29 -0700 Subject: [10] RFR(S) 8188775: Module jdk.internal.vm.compiler.management has not been granted accessClassInPackage.org.graalvm.compiler.hotspot In-Reply-To: <449d883b-1208-9708-2da7-9cd6393a8db7@oracle.com> References: <449d883b-1208-9708-2da7-9cd6393a8db7@oracle.com> Message-ID: <3fad30f1-0050-12c5-4e61-4bda9852457b@oracle.com> Thank you Alan and Sean, I copied preceding code for jdk.internal.vm.compiler because it is not clear for me if accessClassInPackage is enough for all cases. Anyway, I filed next issue to find minimum required permission as you suggested. https://bugs.openjdk.java.net/browse/JDK-8189116 Thanks, Vladimir On 10/10/17 5:26 AM, Sean Mullan wrote: > On 10/9/17 3:55 AM, Alan Bateman wrote: >> On 05/10/2017 00:05, Vladimir Kozlov wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8188775 >>> >>> Changes for 8182701[1] missed changes in default.policy for new module jdk.internal.vm.compiler.management. >>> >>> Add missing code: >>> >>> src/java.base/share/lib/security/default.policy >>> @@ -154,6 +154,10 @@ >>> ???? permission java.security.AllPermission; >>> ?}; >>> >>> +grant codeBase "jrt:/jdk.internal.vm.compiler.management" { >>> +??? permission java.security.AllPermission; >>> +}; >>> + >> This looks okay to me although it would be nice if we could identify the minimal permissions rather than granting it >> AllPermission. > > +1. > > Is there any reason you did not just grant it RuntimePermission "accessClassInPackage.org.graalvm.compiler.hotspot"? > > I see you have already pushed the fix, so I would recommend opening another issue to only grant the required permissions > to the jdk.internal.vm.compiler.management module. > > Thanks, > Sean From volker.simonis at gmail.com Tue Oct 10 17:17:40 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 10 Oct 2017 19:17:40 +0200 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: <4109f960-078f-e582-3c78-71f201a265fd@redhat.com> References: <50cda0ab-f403-372a-ce51-1a27d8821448@oracle.com> <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> <4109f960-078f-e582-3c78-71f201a265fd@redhat.com> Message-ID: On Tue, Oct 10, 2017 at 9:42 AM, Andrew Haley wrote: > On 09/10/17 20:24, Volker Simonis wrote: >> Unfortunately we can't easily generate these stubs during >> 'stubRoutines_init1()' because >> 'generate_dirty_card_log_enqueue_if_necessary()' needs the byte map >> base address which is only initialized in >> 'CardTableModRefBS::initialize()' during 'univers_init()' which >> happens after 'stubRoutines_init1()'. > > Yes you can, you can do something like we do for narrow_ptrs_base: > > if (Universe::is_fully_initialized()) { > mov(rheapbase, Universe::narrow_ptrs_base()); > } else { > lea(rheapbase, ExternalAddress((address)Universe::narrow_ptrs_base_addr())); > ldr(rheapbase, Address(rheapbase)); > } > Hi Andrew, thanks for your suggestion. Yes, I could do that, but that would replace a constant load in the barrier with a constant load plus a load from memory, because during stubRoutines_init1() heap won't be initialized. Not sure about this, but I think we want to avoid this overhead in the barriers. 
Also, Christian proposed in a previous mail to replace the G1 barrier stubs on SPARC with simple runtime calls like on other platforms. While I think that it is probably worthwhile thinking about such a change, I don't know the exact history of these stubs and probably some GC experts should decide if that's really a good idea. I'd be happy to open an extra issue for following up on that path. But for the moments I've simply added a new initialization step "g1_barrier_stubs_init()" between 'univers_init()' and interpreter_init() which is empty on all platforms except SPARC where it generates the corresponding stubs: http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v3/ I've built and smoke-tested the new change on Windows, MacOS, Solaris/SPARC, AIX, Linux/x86_64/ppc64/ppc64le/s390. Unfortunately I don't have access to ARM machines so I couldn't check arm,arm64 and aarch64 although I don't expect any problems there (actually I've just added an empty method there). But it would be great if somebody could check that for any case. @Vladimir: I've also rebased the change for "8187091: ReturnBlobToWrongHeapTest fails because of problems in CodeHeap::contains_blob()": http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v2/ Because it changes the same files like 8166317 it should be applied and pushed only after 8166317 was pushed. Thank you and best regards, Volker > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From coleen.phillimore at oracle.com Tue Oct 10 22:01:01 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 10 Oct 2017 18:01:01 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot Message-ID: Summary: With the new template functions these are unnecessary. The changes are mostly s/_ptr// and removing the cast to return type.? There weren't many types that needed to be improved to match the template version of the function.?? Some notes: 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, rearranging arguments. 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.? I disliked the first name because it's not explicit from the callers that there's an underlying cas.? If people want to fight, I'll remove the function and use cmpxchg because there are only a couple places where this is a little nicer. 3. Added Atomic::sub() Tested with JPRT, mach5 tier1-5 on linux,windows and solaris. open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8188220 Thanks, Coleen From kim.barrett at oracle.com Wed Oct 11 03:43:19 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 10 Oct 2017 23:43:19 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: Message-ID: > On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: > > Summary: With the new template functions these are unnecessary. > > 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. I disliked the first name because it's not explicit from the callers that there's an underlying cas. If people want to fight, I'll remove the function and use cmpxchg because there are only a couple places where this is a little nicer. I'm still looking at other parts, but I want to respond to this now. I object to this change. I think the proposed new name is confusing, suggesting there are two different comparisons involved. 
I originally called it something else that I wasn't entirely happy with. When David suggested replace_if_null I quickly adopted that as I think that name exactly describes what it does. In particular, I think "atomic replace if" pretty clearly suggests a test-and-set / compare-and-swap type of operation. Further, I think any name involving "cmpxchg" is problematic because the result of this operation is intentionally different from cmpxchg, in order to better support the primary use-case, which is lazy initialization. I also object to your alternative suggestion of removing the operation entirely and just using cmpxchg directly instead. I don't recall how many occurrences there presently are, but I suspect more could easily be added; it's part of a lazy initialization pattern similar to DCLP but without the locks. From david.holmes at oracle.com Wed Oct 11 03:55:27 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 11 Oct 2017 13:55:27 +1000 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: Message-ID: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> On 11/10/2017 1:43 PM, Kim Barrett wrote: >> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >> >> Summary: With the new template functions these are unnecessary. >> >> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. I disliked the first name because it's not explicit from the callers that there's an underlying cas. If people want to fight, I'll remove the function and use cmpxchg because there are only a couple places where this is a little nicer. > > I'm still looking at other parts, but I want to respond to this now. > > I object to this change. I think the proposed new name is confusing, > suggesting there are two different comparisons involved. > > I originally called it something else that I wasn't entirely happy > with. When David suggested replace_if_null I quickly adopted that as > I think that name exactly describes what it does. In particular, I > think "atomic replace if" pretty clearly suggests a test-and-set / > compare-and-swap type of operation. I totally agree. It's an Atomic operation, the implementation will involve something atomic, it doesn't matter if it is cmpxchg or something else. The name replace_if_null describes exactly what the function does - it doesn't have to describe how it does it. David ----- > Further, I think any name involving "cmpxchg" is problematic because > the result of this operation is intentionally different from cmpxchg, > in order to better support the primary use-case, which is lazy > initialization. > > I also object to your alternative suggestion of removing the operation > entirely and just using cmpxchg directly instead. I don't recall how > many occurrences there presently are, but I suspect more could easily > be added; it's part of a lazy initialization pattern similar to DCLP > but without the locks. > From erik.osterlund at oracle.com Wed Oct 11 07:45:59 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 11 Oct 2017 09:45:59 +0200 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> Message-ID: <59DDCC37.8050306@oracle.com> Hi, First off - big thanks to Coleen for this cleanup. Nice! I think I have to take Coleen's side here regarding replace_if_null. 
Here is why: 1) I do not see how performing a CAS expecting NULL specifically is special enough that it warrants its own operation. It does not save many characters to just type it, and makes it less obvious what it does, which seems unnecessary to me. Atomic ought to have the minimum atomic operations required and not get cluttered with helpers. 2) To me it really does matter what each operation boils down to in Atomic, especially in terms of semantics. Will my replace_if_null have acquire semantics if it does not find null? Will it have trailing leading, or bidirectional fencing if it succeeds, or just release semantics on the store? Does it allow spurious failures? It matters to me, and should preferrably not be abstracted away in my opinion. And if we really depend on it behaving exactly like Atomic::cmpxchg semantically, I think (like Coleen) that either the name should reflect that, or preferrably for me, it should be removed and replaced with an explicit Atomic::cmpxchg. 3) I prefer not to have multiple APIs for doing the same thing. We all know what happens when programmers are given the choice of two different ways of expressing the same thing: they start disagreeing about how to express that thing. Now in this changeset, there are inconsistencies already. For example in classLoaderData.cpp:946 there is one occurence of an explicit cmpxchg that expects null (for the purposes of lazy initialization), while other places (e.g. nmethod.cpp:1662) use the abstraction. Should that be changed now (and in subsequent changesets) to use the abstraction to make the code consistent? I might think this should not matter and that the explicit CAS is okay, but I can almost promise somebody will have the opposite opinion. By having one way of performing a CAS that expects 0, we can spend less time disagreeing about which way we should CAS, and more time on other things of more importance. This is just my 50 cent, letting Coleen know she is not the only one with similar thoughts. I have not reviewed this completely yet - thought I'd wait with that until we agree about replace_if_null, if that is okay. Thanks, /Erik On 2017-10-11 05:55, David Holmes wrote: > On 11/10/2017 1:43 PM, Kim Barrett wrote: >>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>> >>> Summary: With the new template functions these are unnecessary. >>> >>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. I >>> disliked the first name because it's not explicit from the callers >>> that there's an underlying cas. If people want to fight, I'll >>> remove the function and use cmpxchg because there are only a couple >>> places where this is a little nicer. >> >> I'm still looking at other parts, but I want to respond to this now. >> >> I object to this change. I think the proposed new name is confusing, >> suggesting there are two different comparisons involved. >> >> I originally called it something else that I wasn't entirely happy >> with. When David suggested replace_if_null I quickly adopted that as >> I think that name exactly describes what it does. In particular, I >> think "atomic replace if" pretty clearly suggests a test-and-set / >> compare-and-swap type of operation. > > I totally agree. It's an Atomic operation, the implementation will > involve something atomic, it doesn't matter if it is cmpxchg or > something else. The name replace_if_null describes exactly what the > function does - it doesn't have to describe how it does it. 
> > David > ----- > >> Further, I think any name involving "cmpxchg" is problematic because >> the result of this operation is intentionally different from cmpxchg, >> in order to better support the primary use-case, which is lazy >> initialization. >> >> I also object to your alternative suggestion of removing the operation >> entirely and just using cmpxchg directly instead. I don't recall how >> many occurrences there presently are, but I suspect more could easily >> be added; it's part of a lazy initialization pattern similar to DCLP >> but without the locks. >> From david.holmes at oracle.com Wed Oct 11 08:09:29 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 11 Oct 2017 18:09:29 +1000 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <59DDCC37.8050306@oracle.com> References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> Message-ID: <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: > Hi, > > First off - big thanks to Coleen for this cleanup. Nice! > > I think I have to take Coleen's side here regarding replace_if_null. > Here is why: > > 1) I do not see how performing a CAS expecting NULL specifically is > special enough that it warrants its own operation. It does not save many > characters to just type it, and makes it less obvious what it does, > which seems unnecessary to me. Atomic ought to have the minimum atomic > operations required and not get cluttered with helpers. From the earlier review thread related to the initial templatization of Atomic: "(1) cmpxchg(v, p, NULL), to store a pointer if no pointer is already present. This can be used as an alternative to DCLP. One way to deal with this might be an overload on std::nullptr_t and use nullptr, but that requires C++11. We don't have any current uses of this that I could find, but it's a sufficiently interesting idiom that I'm relucant to forbid it. But such idiomatic usage could be wrapped up in its own little package that can deal with the restriction." "I've also added bool Atomic::conditional_store_ptr(T, D volatile*), for the idiom of storing a value if the old value is NULL. It turns out there are about 25 occurrences of this idiom in Hotspot, so a utility for it seems warranted. The current implementation is just a straightforward wrapper around cmpxchg, which means it can't take advantage of gcc's __sync_bool_compare_and_swap. That can be dealt with later if desired." > 2) To me it really does matter what each operation boils down to in > Atomic, especially in terms of semantics. Will my replace_if_null have > acquire semantics if it does not find null? Will it have trailing > leading, or bidirectional fencing if it succeeds, or just release > semantics on the store? Does it allow spurious failures? It matters to > me, and should preferrably not be abstracted away in my opinion. I can buy that partially but you're stretching things given you can't glean those details from the name cmpxchg either. > And if we really depend on it behaving exactly like Atomic::cmpxchg > semantically, I think (like Coleen) that either the name should reflect > that, or preferrably for me, it should be removed and replaced with an > explicit Atomic::cmpxchg. I don't think we do/need-to depend on that. > 3) I prefer not to have multiple APIs for doing the same thing. 
We all > know what happens when programmers are given the choice of two different > ways of expressing the same thing: they start disagreeing about how to > express that thing. Now in this changeset, there are inconsistencies > already. For example in classLoaderData.cpp:946 there is one occurence > of an explicit cmpxchg that expects null (for the purposes of lazy > initialization), while other places (e.g. nmethod.cpp:1662) use the > abstraction. Should that be changed now (and in subsequent changesets) > to use the abstraction to make the code consistent? I might think this > should not matter and that the explicit CAS is okay, but I can almost > promise somebody will have the opposite opinion. By having one way of > performing a CAS that expects 0, we can spend less time disagreeing > about which way we should CAS, and more time on other things of more > importance. > > This is just my 50 cent, letting Coleen know she is not the only one > with similar thoughts. Removing the operation is a different argument to renaming it. Most of the above argues for removing it. :) Cheers, David ----- > I have not reviewed this completely yet - thought I'd wait with that > until we agree about replace_if_null, if that is okay. > > Thanks, > /Erik > > On 2017-10-11 05:55, David Holmes wrote: >> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Summary: With the new template functions these are unnecessary. >>>> >>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.? I >>>> disliked the first name because it's not explicit from the callers >>>> that there's an underlying cas.? If people want to fight, I'll >>>> remove the function and use cmpxchg because there are only a couple >>>> places where this is a little nicer. >>> >>> I'm still looking at other parts, but I want to respond to this now. >>> >>> I object to this change.? I think the proposed new name is confusing, >>> suggesting there are two different comparisons involved. >>> >>> I originally called it something else that I wasn't entirely happy >>> with.? When David suggested replace_if_null I quickly adopted that as >>> I think that name exactly describes what it does.? In particular, I >>> think "atomic replace if" pretty clearly suggests a test-and-set / >>> compare-and-swap type of operation. >> >> I totally agree. It's an Atomic operation, the implementation will >> involve something atomic, it doesn't matter if it is cmpxchg or >> something else. The name replace_if_null describes exactly what the >> function does - it doesn't have to describe how it does it. >> >> David >> ----- >> >>> Further, I think any name involving "cmpxchg" is problematic because >>> the result of this operation is intentionally different from cmpxchg, >>> in order to better support the primary use-case, which is lazy >>> initialization. >>> >>> I also object to your alternative suggestion of removing the operation >>> entirely and just using cmpxchg directly instead.? I don't recall how >>> many occurrences there presently are, but I suspect more could easily >>> be added; it's part of a lazy initialization pattern similar to DCLP >>> but without the locks. 
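For readers following along, the idiom being debated is easy to sketch. The fragment below uses std::atomic rather than HotSpot's Atomic class so it stands alone, and every name in it is made up for illustration -- it shows the shape of the lazy-initialization pattern Kim describes (DCLP-like, but without a lock), not code from the webrev.

  #include <atomic>

  struct LazyTable {
    std::atomic<int*> _table{nullptr};

    int* table() {
      int* t = _table.load(std::memory_order_acquire);
      if (t == nullptr) {
        int* fresh = new int[64]();        // several threads may race to build one
        int* expected = nullptr;
        // Publish only if the field is still NULL -- the "CAS expecting NULL"
        // step that replace_if_null wraps and that a plain cmpxchg also expresses.
        if (_table.compare_exchange_strong(expected, fresh,
                                           std::memory_order_acq_rel,
                                           std::memory_order_acquire)) {
          t = fresh;                       // we won the race
        } else {
          delete[] fresh;                  // another thread already published
          t = expected;                    // adopt the winner's table
        }
      }
      return t;
    }
  };

Whichever spelling wins, the essence of the pattern is that losers discard their copy and adopt the published value, which is why the operation's result only needs to say whether the install happened.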
>>> > From robbin.ehn at oracle.com Wed Oct 11 08:12:04 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 11 Oct 2017 10:12:04 +0200 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> Message-ID: <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> On 10/11/2017 10:09 AM, David Holmes wrote: > On 11/10/2017 5:45 PM, Erik ?sterlund wrote: > > Removing the operation is a different argument to renaming it. Most of the above argues for removing it. :) +1 on removing Thanks, Robbin > > Cheers, > David > ----- > >> I have not reviewed this completely yet - thought I'd wait with that until we agree about replace_if_null, if that is okay. >> >> Thanks, >> /Erik >> >> On 2017-10-11 05:55, David Holmes wrote: >>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Summary: With the new template functions these are unnecessary. >>>>> >>>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.? I disliked the first name because it's not explicit from the callers that there's an underlying cas. >>>>> If people want to fight, I'll remove the function and use cmpxchg because there are only a couple places where this is a little nicer. >>>> >>>> I'm still looking at other parts, but I want to respond to this now. >>>> >>>> I object to this change.? I think the proposed new name is confusing, >>>> suggesting there are two different comparisons involved. >>>> >>>> I originally called it something else that I wasn't entirely happy >>>> with.? When David suggested replace_if_null I quickly adopted that as >>>> I think that name exactly describes what it does.? In particular, I >>>> think "atomic replace if" pretty clearly suggests a test-and-set / >>>> compare-and-swap type of operation. >>> >>> I totally agree. It's an Atomic operation, the implementation will involve something atomic, it doesn't matter if it is cmpxchg or something else. The name >>> replace_if_null describes exactly what the function does - it doesn't have to describe how it does it. >>> >>> David >>> ----- >>> >>>> Further, I think any name involving "cmpxchg" is problematic because >>>> the result of this operation is intentionally different from cmpxchg, >>>> in order to better support the primary use-case, which is lazy >>>> initialization. >>>> >>>> I also object to your alternative suggestion of removing the operation >>>> entirely and just using cmpxchg directly instead.? I don't recall how >>>> many occurrences there presently are, but I suspect more could easily >>>> be added; it's part of a lazy initialization pattern similar to DCLP >>>> but without the locks. 
>>>> >> From coleen.phillimore at oracle.com Wed Oct 11 11:07:28 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 11 Oct 2017 07:07:28 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> Message-ID: On 10/11/17 4:12 AM, Robbin Ehn wrote: > On 10/11/2017 10:09 AM, David Holmes wrote: >> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >> >> Removing the operation is a different argument to renaming it. Most >> of the above argues for removing it. :) > > +1 on removing Thank you for all your feedback.? Erik best described what I was thinking.? I will remove it then.? There were not that many instances and one instance that people thought would be useful, needed the old return value. Coleen > > Thanks, Robbin > >> >> Cheers, >> David >> ----- >> >>> I have not reviewed this completely yet - thought I'd wait with that >>> until we agree about replace_if_null, if that is okay. >>> >>> Thanks, >>> /Erik >>> >>> On 2017-10-11 05:55, David Holmes wrote: >>>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> Summary: With the new template functions these are unnecessary. >>>>>> >>>>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.? I >>>>>> disliked the first name because it's not explicit from the >>>>>> callers that there's an underlying cas.? If people want to fight, >>>>>> I'll remove the function and use cmpxchg because there are only a >>>>>> couple places where this is a little nicer. >>>>> >>>>> I'm still looking at other parts, but I want to respond to this now. >>>>> >>>>> I object to this change.? I think the proposed new name is confusing, >>>>> suggesting there are two different comparisons involved. >>>>> >>>>> I originally called it something else that I wasn't entirely happy >>>>> with.? When David suggested replace_if_null I quickly adopted that as >>>>> I think that name exactly describes what it does.? In particular, I >>>>> think "atomic replace if" pretty clearly suggests a test-and-set / >>>>> compare-and-swap type of operation. >>>> >>>> I totally agree. It's an Atomic operation, the implementation will >>>> involve something atomic, it doesn't matter if it is cmpxchg or >>>> something else. The name replace_if_null describes exactly what the >>>> function does - it doesn't have to describe how it does it. >>>> >>>> David >>>> ----- >>>> >>>>> Further, I think any name involving "cmpxchg" is problematic because >>>>> the result of this operation is intentionally different from cmpxchg, >>>>> in order to better support the primary use-case, which is lazy >>>>> initialization. >>>>> >>>>> I also object to your alternative suggestion of removing the >>>>> operation >>>>> entirely and just using cmpxchg directly instead.? I don't recall how >>>>> many occurrences there presently are, but I suspect more could easily >>>>> be added; it's part of a lazy initialization pattern similar to DCLP >>>>> but without the locks. 
>>>>> >>> From robbin.ehn at oracle.com Wed Oct 11 13:37:51 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 11 Oct 2017 15:37:51 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes Message-ID: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Hi all, Starting the review of the code while JEP work is still not completed. JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not just all threads or none. Entire changeset: http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ Divided into 3-parts, SafepointMechanism abstraction: http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ Consolidating polling page allocation: http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ Handshakes: http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a handshake can be performed with that single JavaThread as well. The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. Example of potential use-cases: -Biased lock revocation -External requests for stack traces -Deoptimization -Async exception delivery -External suspension -Eliding memory barriers All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported platforms are Linux x64 and Solaris SPARC. Tested heavily with various test suits and comes with a few new tests. Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all JavaThreads in an array instead of a linked list. 
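To make the per-thread indirection concrete, here is a stripped-down sketch. The identifiers are invented for illustration and are not the SafepointMechanism/Handshake classes from the webrev; in the real VM the armed page is memory-protected so the poll load traps into a handler, whereas in this standalone fragment both pages are ordinary memory and the poll simply falls through.

  #include <atomic>

  static char disarmed_page[4096];   // readable: the poll is effectively a no-op
  static char armed_page[4096];      // stand-in for the guarded page

  struct JavaThreadPoll {
    // Every poll site loads through this per-thread pointer. Arming one thread
    // redirects only that thread's polls, so a single thread can be stopped
    // without a global safepoint.
    std::atomic<char*> poll_page{disarmed_page};

    void arm()    { poll_page.store(armed_page,    std::memory_order_release); }
    void disarm() { poll_page.store(disarmed_page, std::memory_order_release); }

    // Roughly what the generated poll corresponds to: one dependent load.
    void poll() {
      volatile char probe = *poll_page.load(std::memory_order_acquire);
      (void)probe;
    }
  };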
Thanks, Robbin From coleen.phillimore at oracle.com Wed Oct 11 13:50:08 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 11 Oct 2017 09:50:08 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> Message-ID: Please review version .02 which removes use of replace_if_null, but not the function.? A separate RFE can be filed to discuss that. open webrev at http://cr.openjdk.java.net/~coleenp/8188220.02/webrev Thanks, Coleen On 10/11/17 7:07 AM, coleen.phillimore at oracle.com wrote: > > > On 10/11/17 4:12 AM, Robbin Ehn wrote: >> On 10/11/2017 10:09 AM, David Holmes wrote: >>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>> >>> Removing the operation is a different argument to renaming it. Most >>> of the above argues for removing it. :) >> >> +1 on removing > > Thank you for all your feedback.? Erik best described what I was > thinking.? I will remove it then.? There were not that many instances > and one instance that people thought would be useful, needed the old > return value. > > Coleen >> >> Thanks, Robbin >> >>> >>> Cheers, >>> David >>> ----- >>> >>>> I have not reviewed this completely yet - thought I'd wait with >>>> that until we agree about replace_if_null, if that is okay. >>>> >>>> Thanks, >>>> /Erik >>>> >>>> On 2017-10-11 05:55, David Holmes wrote: >>>>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> Summary: With the new template functions these are unnecessary. >>>>>>> >>>>>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.? >>>>>>> I disliked the first name because it's not explicit from the >>>>>>> callers that there's an underlying cas.? If people want to >>>>>>> fight, I'll remove the function and use cmpxchg because there >>>>>>> are only a couple places where this is a little nicer. >>>>>> >>>>>> I'm still looking at other parts, but I want to respond to this now. >>>>>> >>>>>> I object to this change.? I think the proposed new name is >>>>>> confusing, >>>>>> suggesting there are two different comparisons involved. >>>>>> >>>>>> I originally called it something else that I wasn't entirely happy >>>>>> with.? When David suggested replace_if_null I quickly adopted >>>>>> that as >>>>>> I think that name exactly describes what it does.? In particular, I >>>>>> think "atomic replace if" pretty clearly suggests a test-and-set / >>>>>> compare-and-swap type of operation. >>>>> >>>>> I totally agree. It's an Atomic operation, the implementation will >>>>> involve something atomic, it doesn't matter if it is cmpxchg or >>>>> something else. The name replace_if_null describes exactly what >>>>> the function does - it doesn't have to describe how it does it. >>>>> >>>>> David >>>>> ----- >>>>> >>>>>> Further, I think any name involving "cmpxchg" is problematic because >>>>>> the result of this operation is intentionally different from >>>>>> cmpxchg, >>>>>> in order to better support the primary use-case, which is lazy >>>>>> initialization. >>>>>> >>>>>> I also object to your alternative suggestion of removing the >>>>>> operation >>>>>> entirely and just using cmpxchg directly instead.? 
I don't recall >>>>>> how >>>>>> many occurrences there presently are, but I suspect more could >>>>>> easily >>>>>> be added; it's part of a lazy initialization pattern similar to DCLP >>>>>> but without the locks. >>>>>> >>>> > From erik.osterlund at oracle.com Wed Oct 11 15:36:04 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 11 Oct 2017 17:36:04 +0200 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> Message-ID: <59DE3A64.7000009@oracle.com> Hi Coleen, In classLoaderData.cpp:~167: There is a cast to Chunk* when loading _head, but _head is already Chunk*, so it seems like that should not need a cast. In fact, _head should probably be declared as Chunk *volatile as it is accessed concurrently. In parNewGeneration.cpp:~1450: Atomic::add(-n, &_num_par_pushes); can now use Atomic::sub instead. g1PageBasedVirtualSpace.cpp:~249: Do you really need the (char*) cast for Atomic::add? Seems like it already is a char*, unless I missed something. cpCache.hpp: Noticed the casts for &_f1 (declared as volatile Metadata*) to Metadata *volatile*. It seems to me like _f1 should instead be declared as Metaata* volatile, and remove the casts. Also noticed some copyright headers have not been updated, might want to have a look at that. Otherwise, I think this looks good. Thank you again for doing this! Thanks, /Erik On 2017-10-11 15:50, coleen.phillimore at oracle.com wrote: > > Please review version .02 which removes use of replace_if_null, but > not the function. A separate RFE can be filed to discuss that. > > open webrev at http://cr.openjdk.java.net/~coleenp/8188220.02/webrev > > Thanks, > Coleen > > On 10/11/17 7:07 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>> >>>> Removing the operation is a different argument to renaming it. Most >>>> of the above argues for removing it. :) >>> >>> +1 on removing >> >> Thank you for all your feedback. Erik best described what I was >> thinking. I will remove it then. There were not that many instances >> and one instance that people thought would be useful, needed the old >> return value. >> >> Coleen >>> >>> Thanks, Robbin >>> >>>> >>>> Cheers, >>>> David >>>> ----- >>>> >>>>> I have not reviewed this completely yet - thought I'd wait with >>>>> that until we agree about replace_if_null, if that is okay. >>>>> >>>>> Thanks, >>>>> /Erik >>>>> >>>>> On 2017-10-11 05:55, David Holmes wrote: >>>>>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> Summary: With the new template functions these are unnecessary. >>>>>>>> >>>>>>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. >>>>>>>> I disliked the first name because it's not explicit from the >>>>>>>> callers that there's an underlying cas. If people want to >>>>>>>> fight, I'll remove the function and use cmpxchg because there >>>>>>>> are only a couple places where this is a little nicer. >>>>>>> >>>>>>> I'm still looking at other parts, but I want to respond to this >>>>>>> now. >>>>>>> >>>>>>> I object to this change. 
I think the proposed new name is >>>>>>> confusing, >>>>>>> suggesting there are two different comparisons involved. >>>>>>> >>>>>>> I originally called it something else that I wasn't entirely happy >>>>>>> with. When David suggested replace_if_null I quickly adopted >>>>>>> that as >>>>>>> I think that name exactly describes what it does. In particular, I >>>>>>> think "atomic replace if" pretty clearly suggests a test-and-set / >>>>>>> compare-and-swap type of operation. >>>>>> >>>>>> I totally agree. It's an Atomic operation, the implementation >>>>>> will involve something atomic, it doesn't matter if it is cmpxchg >>>>>> or something else. The name replace_if_null describes exactly >>>>>> what the function does - it doesn't have to describe how it does it. >>>>>> >>>>>> David >>>>>> ----- >>>>>> >>>>>>> Further, I think any name involving "cmpxchg" is problematic >>>>>>> because >>>>>>> the result of this operation is intentionally different from >>>>>>> cmpxchg, >>>>>>> in order to better support the primary use-case, which is lazy >>>>>>> initialization. >>>>>>> >>>>>>> I also object to your alternative suggestion of removing the >>>>>>> operation >>>>>>> entirely and just using cmpxchg directly instead. I don't >>>>>>> recall how >>>>>>> many occurrences there presently are, but I suspect more could >>>>>>> easily >>>>>>> be added; it's part of a lazy initialization pattern similar to >>>>>>> DCLP >>>>>>> but without the locks. >>>>>>> >>>>> >> > From rohitarulraj at gmail.com Wed Oct 11 16:20:01 2017 From: rohitarulraj at gmail.com (Rohit Arul Raj) Date: Wed, 11 Oct 2017 21:50:01 +0530 Subject: Issues with JDK 9 crashing itself and the operating system In-Reply-To: References: Message-ID: Hello Jeronimo, Sorry for the late reply. We have already forwarded the issue to the relevant team here to confirm if it is indeed specific to Ryzen + Linux. Please give us some more time to confirm the same. Regards, Rohit On Tue, Oct 10, 2017 at 6:29 PM, Jeronimo Backes wrote: > Hello Rohit > Do you have any update regarding the cause of this? Looks like it is > specific to Ryzen+Linux. > > On 25 September 2017 at 20:33, Rohit Arul Raj > wrote: >> >> Hello Jeronimo, >> >> Thanks for the detailed report. We were able to reproduce the issue on >> our machine. >> We will analyze this further and get back to you. >> >> Regards, >> Rohit >> >> On Sat, Sep 23, 2017 at 4:46 PM, Jeronimo Backes >> wrote: >> > Hello, my name is Jeronimo and I'm the author of the univocity-parsers >> > library (https://github.com/uniVocity/univocity-parsers) and I'm writing >> > to >> > you by recommendation of Erik Duveblad. >> > >> > Basically, I recently installed the JDK 9 distributed by Oracle on my >> > development computer and when I try to build my project (with a simple >> > `mvn >> > clean install` command) the JVM crashes with: >> > >> > >> > # A fatal error has been detected by the Java Runtime Environment: >> > # >> > # SIGSEGV (0xb) at pc=0x00007f18b96c52f0, pid=3865, tid=3904 >> > # >> > # JRE version: Java(TM) SE Runtime Environment (9.0+181) (build 9+181) >> > # Java VM: Java HotSpot(TM) 64-Bit Server VM (9+181, mixed mode, tiered, >> > compressed oops, g1 gc, linux-amd64) >> > # Problematic frame: >> > # V [libjvm.so+0x9292f0] >> > JVMCIGlobals::check_jvmci_flags_are_consistent()+0x120 >> > # >> > # Core dump will be written. 
Default location: Core dumps may be >> > processed >> > with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %e" (or >> > dumping to >> > /home/jbax/dev/repository/univocity-parsers/core.3865) >> > # >> > # An error report file with more information is saved as: >> > # /home/jbax/dev/repository/univocity-parsers/hs_err_pid3865.log >> > # >> > # Compiler replay data is saved as: >> > # /home/jbax/dev/repository/univocity-parsers/replay_pid3865.log >> > # >> > # If you would like to submit a bug report, please visit: >> > # http://bugreport.java.com/bugreport/crash.jsp >> > # >> > >> > >> > The hs_err files generated are available here >> > >> > https://github.com/uniVocity/univocity-parsers/files/1326484/jdk_9_crash2.zip. >> > This zip also contains the pom.xml file I used. The build succeeded 4 >> > times >> > before the JVM crashed. >> > >> > Yesterday I had the crash happen 100% of the time, but the CPU was >> > overclocked to 3.6Ghz (never had any issue with it though) and saved the >> > error file here: >> > >> > https://github.com/uniVocity/univocity-parsers/files/1324326/jdk_9_crash.zip. >> > I created an issue on github to investigate this: >> > https://github.com/uniVocity/univocity-parsers/issues/189. There Erik >> > mentioned that: >> > >> > "Looking at the hs_err file, the stack trace is "wrong", a C2 Compiler >> > Thread can't call JVMCIGlobals::check_jvmci_flags_are_consistent (and >> > the >> > value of the register RIP does not correspond to any instruction in the >> > compiled version of that function). This makes me suspect that something >> > could be wrong with your CPU, the CPU should not have jumped to this >> > memory >> > location." >> > >> > Things still fail with stock hardware settings. More details about my >> > environment : >> > >> > OS, Maven and Java versions: >> > >> > [jbax at linux-pc ~]$ mvn -version >> > Apache Maven 3.2.5 (12a6b3acb947671f09b81f49094c53f426d8cea1; >> > 2014-12-15T03:59:23+10:30) >> > Maven home: /home/jbax/dev/apache-maven >> > Java version: 9, vendor: Oracle Corporation >> > Java home: /home/jbax/dev/jdk9 >> > Default locale: en_AU, platform encoding: UTF-8 >> > OS name: "linux", version: "4.12.13-1-manjaro", arch: "amd64", family: >> > "unix" >> > [jbax at linux-pc ~]$ >> > >> > Hardware: >> > [jbax at linux-pc univocity-parsers]$ lscpu >> > Architecture: x86_64 >> > CPU op-mode(s): 32-bit, 64-bit >> > Byte Order: Little Endian >> > CPU(s): 16 >> > On-line CPU(s) list: 0-15 >> > Thread(s) per core: 2 >> > Core(s) per socket: 8 >> > Socket(s): 1 >> > NUMA node(s): 1 >> > Vendor ID: AuthenticAMD >> > CPU family: 23 >> > Model: 1 >> > Model name: AMD Ryzen 7 1700 Eight-Core Processor >> > Stepping: 1 >> > CPU MHz: 1550.000 >> > CPU max MHz: 3000.0000 >> > CPU min MHz: 1550.0000 >> > BogoMIPS: 6001.43 >> > Virtualization: AMD-V >> > L1d cache: 32K >> > L1i cache: 64K >> > L2 cache: 512K >> > L3 cache: 8192K >> > NUMA node0 CPU(s): 0-15 >> > Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr >> > pge >> > mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext >> > fxsr_opt >> > pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid >> > extd_apicid >> > aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe >> > popcnt >> > aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm >> > sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core >> > perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 >> > smep >> > bmi2 rdseed adx smap clflushopt 
sha_ni xsaveopt xsavec xgetbv1 xsaves >> > clzero >> > irperf arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid >> > decodeassists pausefilter pfthreshold avic overflow_recov succor smca >> > >> > On an unrelated note, I use an old java application that crashes the >> > entire >> > OS for me when Java 9 is used: http://www.jinchess.com/download >> > >> > It's just a matter of downloading, unpacking and trying to start it with >> > jin-2.14.1/jin >> > >> > The OS crashes and I have to hard-reset the computer. It works just fine >> > if >> > revert back to Java 6, 7 or 8. >> > >> > I thought you'd might want to investigate what is going on. Let me know >> > if >> > you need more information. >> > >> > Best regards, >> > >> > Jeronimo. >> > >> > >> > >> > >> > -- >> > the uniVocity team >> > www.univocity.com > > > > > -- > the uniVocity team > www.univocity.com From coleen.phillimore at oracle.com Wed Oct 11 17:44:34 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 11 Oct 2017 13:44:34 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <59DE3A64.7000009@oracle.com> References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> <59DE3A64.7000009@oracle.com> Message-ID: <8f16bce9-8131-8ec6-18af-cba3d8234d71@oracle.com> On 10/11/17 11:36 AM, Erik ?sterlund wrote: > Hi Coleen, > > In classLoaderData.cpp:~167: > There is a cast to Chunk* when loading _head, but _head is already > Chunk*, so it seems like that should not need a cast. In fact, _head > should probably be declared as Chunk *volatile as it is accessed > concurrently. Yes, you are right.? I fixed it and now declare _head as Chunk* volatile (star goes on type I think). > > In parNewGeneration.cpp:~1450: > Atomic::add(-n, &_num_par_pushes); > can now use Atomic::sub instead. Fixed. > > g1PageBasedVirtualSpace.cpp:~249: > Do you really need the (char*) cast for Atomic::add? Seems like it > already is a char*, unless I missed something. > Nope.? Missed that one. > cpCache.hpp: > Noticed the casts for &_f1 (declared as volatile Metadata*) to > Metadata *volatile*. It seems to me like _f1 should instead be > declared as Metaata* volatile, and remove the casts. > Fixed.? You are right about the declaration for _f1.? It should be Metadata* volatile. > Also noticed some copyright headers have not been updated, might want > to have a look at that. > I forgot to say that I update the copyrights in my commit script. > Otherwise, I think this looks good. Thank you again for doing this! > Thank you so much for reviewing all of this and making the templates easy to use. Coleen > Thanks, > /Erik > > On 2017-10-11 15:50, coleen.phillimore at oracle.com wrote: >> >> Please review version .02 which removes use of replace_if_null, but >> not the function.? A separate RFE can be filed to discuss that. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.02/webrev >> >> Thanks, >> Coleen >> >> On 10/11/17 7:07 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>>> >>>>> Removing the operation is a different argument to renaming it. >>>>> Most of the above argues for removing it. :) >>>> >>>> +1 on removing >>> >>> Thank you for all your feedback.? 
Erik best described what I was >>> thinking.? I will remove it then.? There were not that many >>> instances and one instance that people thought would be useful, >>> needed the old return value. >>> >>> Coleen >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Cheers, >>>>> David >>>>> ----- >>>>> >>>>>> I have not reviewed this completely yet - thought I'd wait with >>>>>> that until we agree about replace_if_null, if that is okay. >>>>>> >>>>>> Thanks, >>>>>> /Erik >>>>>> >>>>>> On 2017-10-11 05:55, David Holmes wrote: >>>>>>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>> >>>>>>>>> Summary: With the new template functions these are unnecessary. >>>>>>>>> >>>>>>>>> 2. renamed Atomic::replace_if_null to >>>>>>>>> Atomic::cmpxchg_if_null.? I disliked the first name because >>>>>>>>> it's not explicit from the callers that there's an underlying >>>>>>>>> cas.? If people want to fight, I'll remove the function and >>>>>>>>> use cmpxchg because there are only a couple places where this >>>>>>>>> is a little nicer. >>>>>>>> >>>>>>>> I'm still looking at other parts, but I want to respond to this >>>>>>>> now. >>>>>>>> >>>>>>>> I object to this change.? I think the proposed new name is >>>>>>>> confusing, >>>>>>>> suggesting there are two different comparisons involved. >>>>>>>> >>>>>>>> I originally called it something else that I wasn't entirely happy >>>>>>>> with.? When David suggested replace_if_null I quickly adopted >>>>>>>> that as >>>>>>>> I think that name exactly describes what it does. In particular, I >>>>>>>> think "atomic replace if" pretty clearly suggests a test-and-set / >>>>>>>> compare-and-swap type of operation. >>>>>>> >>>>>>> I totally agree. It's an Atomic operation, the implementation >>>>>>> will involve something atomic, it doesn't matter if it is >>>>>>> cmpxchg or something else. The name replace_if_null describes >>>>>>> exactly what the function does - it doesn't have to describe how >>>>>>> it does it. >>>>>>> >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> Further, I think any name involving "cmpxchg" is problematic >>>>>>>> because >>>>>>>> the result of this operation is intentionally different from >>>>>>>> cmpxchg, >>>>>>>> in order to better support the primary use-case, which is lazy >>>>>>>> initialization. >>>>>>>> >>>>>>>> I also object to your alternative suggestion of removing the >>>>>>>> operation >>>>>>>> entirely and just using cmpxchg directly instead.? I don't >>>>>>>> recall how >>>>>>>> many occurrences there presently are, but I suspect more could >>>>>>>> easily >>>>>>>> be added; it's part of a lazy initialization pattern similar to >>>>>>>> DCLP >>>>>>>> but without the locks. 
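Since the volatile placement fixed above (for _head and _f1) is easy to get backwards, a two-line illustration with a made-up type:

  struct Chunk { Chunk* next; };

  volatile Chunk* pointee_is_volatile = nullptr;  // pointer to volatile Chunk: the
                                                  // Chunk object is treated as volatile,
                                                  // the pointer variable itself is not
  Chunk* volatile pointer_is_volatile = nullptr;  // volatile pointer to Chunk: the field
                                                  // itself is what is read and written
                                                  // concurrently

Read right to left, the second declaration is "a volatile pointer to Chunk", which matches the Chunk* volatile and Metadata* volatile declarations settled on in the review above.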
>>>>>>>> >>>>>> >>> >> > From bob.vandette at oracle.com Wed Oct 11 19:11:41 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 11 Oct 2017 15:11:41 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> Message-ID: Here?s an updated webrev for this RFE that contains changes and cleanups based on feedback I?ve received so far. I?m still investigating the best approach for reacting to cpu shares and quotas. I do not believe doing nothing is the answer. http://cr.openjdk.java.net/~bobv/8146115/webrev.01 Updates: 1. I had to move the processing of AggressiveHeap since the container memory size needs to be known before this can be processed. 2. I no longer use the cpuset.cpus contents since sched_getaffinity reports the correct results even if someone manually updates the cgroup data. I originally didn?t think this was the case since sched_setaffinity didn?t automatically update the cpuset file contents but the inverse is true. 3. I ifdef?d the container function support in src/hotspot/share/runtime/os.hpp to avoid putting stubs in all other os platform directories. I can do this if it?s absolutely necessary. Bob. > On Oct 6, 2017, at 7:28 PM, David Holmes wrote: > > On 7/10/2017 1:34 AM, Bob Vandette wrote: >>> On Oct 5, 2017, at 6:12 PM, David Holmes wrote: >>> >>> Hi Bob, >>> >>> On 6/10/2017 3:57 AM, Bob Vandette wrote: >>>>> On Oct 5, 2017, at 12:43 PM, Alex Bagehot > wrote: >>>>> >>>>> Hi David, >>>>> >>>>> On Wed, Oct 4, 2017 at 10:51 PM, David Holmes > wrote: >>>>> >>>>> Hi Alex, >>>>> >>>>> Can you tell me how shares/quotas are actually implemented in >>>>> terms of allocating "cpus" to processes when shares/quotas are >>>>> being applied? >>>>> >>>>> The allocation of cpus to processes/threads(tasks as the kernel sees them) or the other way round is called balancing, which is done by Scheduling domains[3]. >>>>> >>>>> cpu shares use CFS "group" scheduling[1] to apply the share to all the tasks(threads) in the container. The container cpu shares weight maps directly to a task's weight in CFS, which given it is part of a group is divided by the number of tasks in the group (ie. a default container share of 1024 with 2 threads in the container/group would result in each thread/task having a 512 weight[4]). The same values used by nice[2] also. >>>>> >>>>> You can observe the task weight and other scheduler numbers in /proc/sched_debug [4]. You can also kernel trace scheduler activity which typically tells you the tasks involved, the cpu, the event: switch or wakeup, etc. >>>>> >>>>> For example in a 12 cpu system if I have a 50% share do I get all >>>>> 12 CPUs for 50% of a "quantum" each, or do I get 6 CPUs for a full >>>>> quantum each? >>>>> >>>>> >>>>> You get 12 cpus for 50% of the time on the average if there is another workload that has the same weight as you and is consuming as much as it can. >>>>> If there's nothing else running on the machine you get 12 cpus for 100% of the time with a cpu shares only config (ie. the burst capacity). 
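On Bob's point 2 above (trusting sched_getaffinity rather than parsing cpuset.cpus), the query itself is small. This is the plain glibc call shown standalone and Linux-specific, not the webrev's os:: wrapper:

  #ifndef _GNU_SOURCE
  #define _GNU_SOURCE
  #endif
  #include <sched.h>
  #include <cstdio>

  int main() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    // Ask the kernel which CPUs this process may currently run on. Unlike a
    // hand parse of cpuset.cpus, this also reflects later changes made via
    // sched_setaffinity()/numactl, which is the case discussed further down.
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
      perror("sched_getaffinity");
      return 1;
    }
    std::printf("runnable on %d cpus\n", CPU_COUNT(&mask));
    return 0;
  }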
>>>>> >>>>> I validated that the share was balanced over all the cpus by running linux perf events and checking that there were cpu samples on all cpus. There's bound to be other ways of doing it also. >>>>> >>>>> >>>>> When we try to use the "number of processors" to control the >>>>> number of threads created, or the number of partitions in a task, >>>>> then we really want to know how many CPUs we can actually be >>>>> concurrently running on! >>>> I?m not sure that the primary question for serverless container execution. Just because you might happen to burst and have available >>>> to you more CPU time than you specified in your shares doesn?t mean >>>> that a multi-threaded application running in one of these containers should configure itself to use all available host processors. This would result in over-burdoning the system at times of high load. >>> >>> And conversely if you restrict yourself to the "share" of processors you get over time (ie 6 instead of 12) then you can severely impact the performance (response time in particular) of the VM and the application running on the VM. >> So if someone configures an 88 way system to use 1/88 share, you don?t think they expect a highly threaded >> application to run slower than if they didn?t restrict the shares?? The whole idea about shares is to SHARE the >> system. Yes, you?d have better performance when the system is idle and only running a single application but that?s >> not what these container frameworks are trying to accomplish. They want to get the best performance when running many >> many processes. That?s what I?m optimizing for. > > In what I described you are SHARING the system. You're also getting the most benefit from a lightly loaded system. > > To me the conceptual model for a 1/88 share of an 88-way system is that you get 88 processors that appear to run at 1/88 the speed of the physical ones. Not that you get 1 real full speed processor. > >>> >>> But I don't see how this can overburden the system. If you app is running alone you get to use all 12 cpus for 100% of the time and life is good. If another app starts up then your 100% drops proportionately. If you schedule 12 apps all with a 1/12 share then everyone gets up to 12 cpus for 1/12 of the time. It's only if you try to schedule a set of apps with a utilization total greater than 1 does the system become overloaded. >> In my above example, If we run the VM ergonomics based on 88 CPUs, then we are wasting a lot of memory on thread stacks and when >> many of these processes are running, the system will context switch a lot more than it would if we restricted the creation of threads to >> the share amount. > > Context switching is a function of threads and time. My way uses more threads and less time (per unit of work); yours uses less threads and more time. Seems like zero sum to me. > > Memory use is a different matter, but only because you can restrict memory independently of cpus. So you will need to ensure your memory quotas can accommodate the number of threads you expect to run - regardless. > > David > ----- > >> Bob. >>> >>>> The Java runtime, at startup, configures several subsystems to use a number of threads for each system based on the number of available >>>> processors. These subsystems include things like the number of GC >>>> threads, JIT compiler and thread pools. 
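As one concrete example of what hangs off this number: the default GC worker count. The constants below are quoted from memory as an approximation of HotSpot's long-standing ergonomic default, so treat them as illustrative rather than authoritative.

  // Approximate shape of the default ParallelGCThreads ergonomics
  // (constants from memory -- illustrative only).
  unsigned parallel_gc_threads(unsigned active_processors) {
    if (active_processors <= 8) return active_processors;
    return 8 + (active_processors - 8) * 5 / 8;
  }
  // An 88-way host seen as-is: 58 GC workers. The same host seen through a
  // 1-CPU share: 1 worker -- which is why the number chosen here matters.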
>>> >>>> The problem I am trying to solve is to come up with a single number >>>> of CPUs based on container knowledge that can be used for the Java >>>> runtime subsystem to configure itself. I believe that we should >>>> trust the implementor of the Mesos or Kubernetes setup and honor their wishes when coming up with this number and not just use the >>>> processor affinity or number of cpus in the cpuset. >>> >>> I don't agree, as has been discussed before. It's perfectly fine, even desirable, in my opinion to have 12 threads executing concurrently for 50% of the time, rather than only 6 threads for 100% (assuming the scheduling technology is even clever enough to realize it can grant your threads 100%). >>> >>> Over time the amount of work your app can execute is the same, but the time taken for an individual subtask can vary. If you are just doing one-shot batch processing then it makes no difference. If you're running an app that itself services incoming requests then the response time to individual requests can be impacted. To take the worst-case scenario, imagine you get 12 concurrent requests that would each take 1/12 of your cpu quota. With 12 threads on 12 cpus you can service all 12 requests with a response time of 1/12 time units. But with 6 threads on 6 cpus you can only service 6 requests with a 1/12 response time, and the other 6 will have a 1/6 response time. >>> >>>> The challenge is determining the right algorithm that doesn?t penalize the VM. >>> >>> Agreed. But I think the current algorithm may penalize the VM, and more importantly the application it is running. >>> >>>> My current implementation does this: >>>> total available logical processors = min (cpusets,sched_getaffinity,shares/1024, quota/period) >>>> All fractional units are rounded up to the next whole number. >>> >>> My point has always been that I just don't think producing a single number from all these factors is the right/best way to deal with this. I think we really want to be able to answer the question "how many processors can I concurrently execute on" distinct from the question of "how much of a time slice will I get on each of those processors". To me "how many" is the question that "availableProcessors" should be answering - and only that question. How much "share" do I get is a different question, and perhaps one that the VM and the application need to be able to ask. >>> >>> BTW sched_getaffinity should already account for cpusets ?? >>> >>> Cheers, >>> David >>> >>>> Bob. >>>>> >>>>> Makes sense to check. Hopefully there aren't any major errors or omissions in the above. 
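Applied literally, the formula above reads roughly like the sketch below. The cgroup v1 paths are assumptions about a typical Docker host rather than what the webrev probes, a production version would still have to decide how to treat the default cpu.shares value of 1024, and the affinity count parameter stands in for the sched_getaffinity result shown earlier.

  #include <algorithm>
  #include <cmath>
  #include <cstdio>
  #include <fstream>

  static long read_long(const char* path, long fallback) {
    std::ifstream in(path);
    long v;
    return (in >> v) ? v : fallback;
  }

  // min(affinity, shares/1024, quota/period), fractions rounded up.
  int container_cpus(int affinity_count) {
    long shares = read_long("/sys/fs/cgroup/cpu/cpu.shares", -1);
    long quota  = read_long("/sys/fs/cgroup/cpu/cpu.cfs_quota_us", -1);
    long period = read_long("/sys/fs/cgroup/cpu/cpu.cfs_period_us", -1);

    int limit = affinity_count;                 // cpusets / process affinity
    if (shares > 0) {                           // 1024 == one CPU by convention
      limit = std::min(limit, (int)std::ceil(shares / 1024.0));
    }
    if (quota > 0 && period > 0) {              // e.g. docker --cpus=1.5
      limit = std::min(limit, (int)std::ceil((double)quota / period));
    }
    return std::max(limit, 1);
  }

  int main() {
    std::printf("%d\n", container_cpus(8));     // 8 stands in for CPU_COUNT(&mask)
  }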
>>>>> Thanks, >>>>> Alex >>>>> >>>>> [1] https://lwn.net/Articles/240474/ >>>>> [2] https://github.com/torvalds/linux/blob/368f89984bb971b9f8b69eeb85ab19a89f985809/kernel/sched/core.c#L6735 >>>>> [3] https://lwn.net/Articles/80911/ / http://www.i3s.unice.fr/~jplozi/wastedcores/files/extended_talk.pdf >>>>> >>>>> [4] >>>>> >>>>> cfs_rq[13]:/system.slice/docker-f5681788d6daab249c90810fe60da429a2565b901ff34245922a578635b5d607.scope >>>>> >>>>> .exec_clock: 0.000000 >>>>> >>>>> .MIN_vruntime: 0.000001 >>>>> >>>>> .min_vruntime: 8090.087297 >>>>> >>>>> .max_vruntime: 0.000001 >>>>> >>>>> .spread: 0.000000 >>>>> >>>>> .spread0 : -124692718.052832 >>>>> >>>>> .nr_spread_over: 0 >>>>> >>>>> .nr_running: 1 >>>>> >>>>> .load: 1024 >>>>> >>>>> .runnable_load_avg : 1023 >>>>> >>>>> .blocked_load_avg: 0 >>>>> >>>>> .tg_load_avg : 2046 >>>>> >>>>> .tg_load_contrib : 1023 >>>>> >>>>> .tg_runnable_contrib : 1023 >>>>> >>>>> .tg->runnable_avg: 2036 >>>>> >>>>> .tg->cfs_bandwidth.timer_active: 0 >>>>> >>>>> .throttled : 0 >>>>> >>>>> .throttle_count: 0 >>>>> >>>>> .se->exec_start: 236081964.515645 >>>>> >>>>> .se->vruntime: 24403993.326934 >>>>> >>>>> .se->sum_exec_runtime: 8091.135873 >>>>> >>>>> .se->load.weight : 512 >>>>> >>>>> .se->avg.runnable_avg_sum: 45979 >>>>> >>>>> .se->avg.runnable_avg_period : 45979 >>>>> >>>>> .se->avg.load_avg_contrib: 511 >>>>> >>>>> .se->avg.decay_count : 0 >>>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>> On 5/10/2017 6:01 AM, Alex Bagehot wrote: >>>>> >>>>> Hi, >>>>> >>>>> On Wed, Oct 4, 2017 at 7:51 PM, Bob Vandette >>>>> > >>>>> wrote: >>>>> >>>>> >>>>> On Oct 4, 2017, at 2:30 PM, Robbin Ehn >>>>> > >>>>> wrote: >>>>> >>>>> Thanks Bob for looking into this. >>>>> >>>>> On 10/04/2017 08:14 PM, Bob Vandette wrote: >>>>> >>>>> Robbin, >>>>> I?ve looked into this issue and you are correct. I do have to examine >>>>> >>>>> both the >>>>> >>>>> sched_getaffinity results as well as the cgroup >>>>> cpu subsystem >>>>> >>>>> configuration >>>>> >>>>> files in order to provide a reasonable value for >>>>> active_processors. If >>>>> >>>>> I was only >>>>> >>>>> interested in cpusets, I could simply rely on the >>>>> getaffinity call but >>>>> >>>>> I also want to >>>>> >>>>> factor in shares and quotas as well. >>>>> >>>>> >>>>> We had a quick discussion at the office, we actually >>>>> do think that you >>>>> >>>>> could skip reading the shares and quotas. >>>>> >>>>> It really depends on what the user expect, if he give >>>>> us 4 cpu's with >>>>> >>>>> 50% or 2 full cpu what do he expect the differences would be? >>>>> >>>>> One could argue that he 'knows' that he will only use >>>>> max 50% and thus >>>>> >>>>> we can act as if he is giving us 4 full cpu. >>>>> >>>>> But I'll leave that up to you, just a tough we had. >>>>> >>>>> >>>>> It?s my opinion that we should do something if someone >>>>> makes the effort to >>>>> configure their >>>>> containers to use quotas or shares. There are many >>>>> different opinions on >>>>> what the right that >>>>> right ?something? is. >>>>> >>>>> >>>>> It might be interesting to look at some real instances of how >>>>> java might[3] >>>>> be deployed in containers. >>>>> Marathon/Mesos[1] and Kubernetes[2] use shares and quotas so >>>>> this is a vast >>>>> chunk of deployments that need both of them today. >>>>> >>>>> >>>>> >>>>> Many developers that are trying to deploy apps that use >>>>> containers say >>>>> they don?t like >>>>> cpusets. 
This is too limiting for them especially when >>>>> the server >>>>> configurations vary >>>>> within their organization. >>>>> >>>>> >>>>> True, however Kubernetes has an alpha feature[5] where it >>>>> allocates cpusets >>>>> to containers that request a whole number of cpus. Previously >>>>> without >>>>> cpusets any container could run on any cpu which we know might >>>>> not be good >>>>> for some workloads that want isolation. A request for a >>>>> fractional or >>>>> burstable amount of cpu would be allocated from a shared cpu >>>>> pool. So >>>>> although manual allocation of cpusets will be flakey[3] , >>>>> automation should >>>>> be able to make it work. >>>>> >>>>> >>>>> >>>>> From everything I?ve read including source code, there >>>>> seems to be a >>>>> consensus that >>>>> shares and quotas are being used as a way to specify a >>>>> fraction of a >>>>> system (number of cpus). >>>>> >>>>> >>>>> A refinement[6] on this is: >>>>> Shares can be used for guaranteed cpu - you will always get >>>>> your share. >>>>> Quota[4] is a limit/constraint - you can never get more than >>>>> the quota. >>>>> So given the below limit of how many shares will be allocated >>>>> on a host you >>>>> can have burstable(or overcommit) capacity if your shares are >>>>> less than >>>>> your quota. >>>>> >>>>> >>>>> >>>>> Docker added ?cpus which is implemented using quotas and >>>>> periods. They >>>>> adjust these >>>>> two parameters to provide a way of calculating the number >>>>> of cpus that >>>>> will be available >>>>> to a process (quota/period). Amazon also documents that >>>>> cpu shares are >>>>> defined to be a multiple of 1024. >>>>> Where 1024 represents a single cpu and a share value of >>>>> N*1024 represents >>>>> N cpus. >>>>> >>>>> >>>>> Kubernetes and Mesos/Marathon also use the N*1024 shares per >>>>> host to >>>>> allocate resources automatically. >>>>> >>>>> Hopefully this provides some background on what a couple of >>>>> orchestration >>>>> systems that will be running java are doing currently in this >>>>> area. >>>>> Thanks, >>>>> Alex >>>>> >>>>> >>>>> [1] https://github.com/apache/mesos/commit/346cc8dd528a28a6e >>>>> >>>>> 1f1cbdb4c95b8bdea2f6070 / (now out of date but appears to be a >>>>> reasonable >>>>> intro : >>>>> https://zcox.wordpress.com/2014/09/17/cpu-resources-in-docke >>>>> >>>>> r-mesos-and-marathon/ ) >>>>> [1a] https://youtu.be/hJyAfC-Z2xk?t=2439 >>>>> >>>>> >>>>> [2] https://kubernetes.io/docs/concepts/configuration/manage >>>>> >>>>> -compute-resources-container/ >>>>> >>>>> [3] https://youtu.be/w1rZOY5gbvk?t=2479 >>>>> >>>>> >>>>> [4] >>>>> https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt >>>>> >>>>> https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf >>>>> >>>>> https://lwn.net/Articles/428175/ >>>>> >>>>> >>>>> [5] >>>>> https://github.com/kubernetes/community/blob/43ce57ac476b9f2ce3f0220354a075e095a0d469/contributors/design-proposals/node/cpu-manager.md >>>>> >>>>> / https://github.com/kubernetes/kubernetes/commit/ >>>>> >>>>> 00f0e0f6504ad8dd85fcbbd6294cd7cf2475fc72 / >>>>> https://vimeo.com/226858314 >>>>> >>>>> >>>>> [6] https://kubernetes.io/docs/concepts/configuration/manage- >>>>> >>>>> compute-resources-container/#how-pods-with-resource-limits-are-run >>>>> >>>>> >>>>> Of course these are just conventions. This is why I >>>>> provided a way of >>>>> specifying the >>>>> number of CPUs so folks deploying Java services can be >>>>> certain they get >>>>> what they want. >>>>> >>>>> Bob. 
>>>>> >>>>> >>>>> I had assumed that when sched_setaffinity was >>>>> called (in your case by >>>>> >>>>> numactl) that the >>>>> >>>>> cgroup cpu config files would be updated to >>>>> reflect the current >>>>> >>>>> processor affinity for the >>>>> >>>>> running process. This is not correct. I have >>>>> updated my changeset and >>>>> >>>>> have successfully >>>>> >>>>> run with your examples below. I?ll post a new >>>>> webrev soon. >>>>> >>>>> >>>>> I see, thanks again! >>>>> >>>>> /Robbin >>>>> >>>>> Thanks, >>>>> Bob. >>>>> >>>>> >>>>> I still want to include the flag for at >>>>> least one Java release in the >>>>> >>>>> event that the new behavior causes some regression >>>>> >>>>> in behavior. I?m trying to make the >>>>> detection robust so that it will >>>>> >>>>> fallback to the current behavior in the event >>>>> >>>>> that cgroups is not configured as expected >>>>> but I?d like to have a way >>>>> >>>>> of forcing the issue. JDK 10 is not >>>>> >>>>> supposed to be a long term support release >>>>> which makes it a good >>>>> >>>>> target for this new behavior. >>>>> >>>>> I agree with David that once we commit to >>>>> cgroups, we should extract >>>>> >>>>> all VM configuration data from that >>>>> >>>>> source. There?s more information >>>>> available for cpusets than just >>>>> >>>>> processor affinity that we might want to >>>>> >>>>> consider when calculating the number of >>>>> processors to assume for the >>>>> >>>>> VM. There?s exclusivity and >>>>> >>>>> effective cpu data available in addition >>>>> to the cpuset string. >>>>> >>>>> >>>>> cgroup only contains limits, not the real hard >>>>> limits. >>>>> You most consider the affinity mask. We that >>>>> have numa nodes do: >>>>> >>>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 >>>>> --membind=1 java >>>>> >>>>> -Xlog:os=debug -cp . ForEver | grep proc >>>>> >>>>> [0.001s][debug][os] Initial active processor >>>>> count set to 16 >>>>> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 >>>>> --membind=1 java >>>>> >>>>> -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | >>>>> grep proc >>>>> >>>>> [0.001s][debug][os] Initial active processor >>>>> count set to 32 >>>>> >>>>> when benchmarking all the time and that must >>>>> be set to 16 otherwise >>>>> >>>>> the flag is really bad for us. >>>>> >>>>> So the flag actually breaks the little numa >>>>> support we have now. >>>>> >>>>> Thanks, Robbin >>>>> >>>>> >>>>> >>>>> From goetz.lindenmaier at sap.com Wed Oct 11 20:06:45 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 11 Oct 2017 20:06:45 +0000 Subject: RFR(M): 8189102: All tools should support -?, -h and --help Message-ID: Hi The tools in jdk should all show the same behavior wrt. help flags. This change normalizes the help flags of a row of the tools in the jdk. Java accepts -?, -h and --help, thus I changed the tools to support these, too. Some tools exited with '1' after displaying the help message, I turned this to '0'. Maybe this is not the right mailing list for this, please advise. Please review this change. I please need a sponsor. http://cr.openjdk.java.net/~goetz/wr17/8189102-helpMessage/webrev.01/ In detail, this fixes the help message of the following tools: jar -? -h --help; added -?. jarsigner -? -h --help; added --help. -help accepted but not documented. javac -? --help; added -?. Removed -help. -h is taken for other purpose javadoc -? -h --help; added -h -?. Removed -help javap -? -h --help; added -h. -help accepted but no more documented. jcmd -? -h --help; added -? 
--help. -help accepted but no more documented. Changed return value to '0' jdb -? -h --help; added -? -h --help. -help accepted but no more documented. jdeprscan -? -h --help; added -? jinfo -? -h --help; added -? --help. -help accepted but no more documented. jjs -h --help; Replaced -help by --help. Adding more not straight forward. jps -? -h --help; added -? --help. -help accepted but no more documented. jshell -? -h --help; added -? jstat -? -h --help; added -h --help. -help accepted but no more documented. Best regards, Goetz. From joe.darcy at oracle.com Wed Oct 11 20:10:31 2017 From: joe.darcy at oracle.com (joe darcy) Date: Wed, 11 Oct 2017 13:10:31 -0700 Subject: RFR(M): 8189102: All tools should support -?, -h and --help In-Reply-To: References: Message-ID: <895e0f83-3b7f-f691-53d6-67a3d6257aa3@oracle.com> Hi Goetz, Note that a change like this require a CSR request for the command line updates and return code modification. The review should also occur on aliases where the various tools are discussed, for example, javac is discussed on compiler-dev and several other tools are discussed on core-libs-dev. Thanks, -Joe On 10/11/2017 1:06 PM, Lindenmaier, Goetz wrote: > Hi > > The tools in jdk should all show the same behavior wrt. help flags. > This change normalizes the help flags of a row of the tools in the jdk. > Java accepts -?, -h and --help, thus I changed the tools to support > these, too. Some tools exited with '1' after displaying the help message, > I turned this to '0'. > > Maybe this is not the right mailing list for this, please advise. > > Please review this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/wr17/8189102-helpMessage/webrev.01/ > > In detail, this fixes the help message of the following tools: > jar -? -h --help; added -?. > jarsigner -? -h --help; added --help. -help accepted but not documented. > javac -? --help; added -?. Removed -help. -h is taken for other purpose > javadoc -? -h --help; added -h -?. Removed -help > javap -? -h --help; added -h. -help accepted but no more documented. > jcmd -? -h --help; added -? --help. -help accepted but no more documented. Changed return value to '0' > jdb -? -h --help; added -? -h --help. -help accepted but no more documented. > jdeprscan -? -h --help; added -? > jinfo -? -h --help; added -? --help. -help accepted but no more documented. > jjs -h --help; Replaced -help by --help. Adding more not straight forward. > jps -? -h --help; added -? --help. -help accepted but no more documented. > jshell -? -h --help; added -? > jstat -? -h --help; added -h --help. -help accepted but no more documented. > > Best regards, > Goetz. From david.holmes at oracle.com Wed Oct 11 21:38:40 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Oct 2017 07:38:40 +1000 Subject: RFR(M): 8189102: All tools should support -?, -h and --help In-Reply-To: <895e0f83-3b7f-f691-53d6-67a3d6257aa3@oracle.com> References: <895e0f83-3b7f-f691-53d6-67a3d6257aa3@oracle.com> Message-ID: On 12/10/2017 6:10 AM, joe darcy wrote: > Hi Goetz, > > Note that a change like this require a CSR request for the command line > updates and return code modification. The review should also occur on > aliases where the various tools are discussed, for example, javac is > discussed on compiler-dev and several other tools are discussed on > core-libs-dev. And none of the tools/launchers fall under hotspot directly. Some may be serviceability ... 
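Reduced to its essentials, the behavior the RFR normalizes is just this (a generic C++ sketch for brevity -- the real tools are Java and this is not their code): treat -?, -h and --help as an explicit request for help and exit with status 0 rather than 1.

  #include <cstring>
  #include <cstdio>

  int main(int argc, char** argv) {
    for (int i = 1; i < argc; i++) {
      // All three spellings are an explicit request for help, so printing
      // the usage text is a success, not an error.
      if (std::strcmp(argv[i], "-?") == 0 ||
          std::strcmp(argv[i], "-h") == 0 ||
          std::strcmp(argv[i], "--help") == 0) {
        std::puts("usage: tool [options] ...");
        return 0;
      }
    }
    // ... normal tool processing ...
    return 0;
  }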
David > Thanks, > > -Joe > > > On 10/11/2017 1:06 PM, Lindenmaier, Goetz wrote: >> Hi >> >> The tools in jdk should all show the same behavior wrt. help flags. >> This change normalizes the help flags of a row of the tools in the jdk. >> Java accepts -?, -h and --help, thus I changed the tools to support >> these, too.? Some tools exited with '1' after displaying the help >> message, >> I turned this to '0'. >> >> Maybe this is not the right mailing list for this, please advise. >> >> Please review this change. I please need a sponsor. >> http://cr.openjdk.java.net/~goetz/wr17/8189102-helpMessage/webrev.01/ >> >> In detail, this fixes the help message of the following tools: >> jar????????? -? -h --help;? added -?. >> jarsigner??? -? -h --help;? added --help. -help accepted but not >> documented. >> javac??????? -???? --help;? added -?. Removed -help. -h is taken for >> other purpose >> javadoc????? -? -h --help;? added -h -?. Removed -help >> javap??????? -? -h --help;? added -h. -help accepted but no more >> documented. >> jcmd???????? -? -h --help;? added -? --help. -help accepted but no >> more documented. Changed return value to '0' >> jdb????????? -? -h --help;? added -? -h --help. -help accepted but no >> more documented. >> jdeprscan??? -? -h --help;? added -? >> jinfo??????? -? -h --help;? added -? --help. -help accepted but no >> more documented. >> jjs???????????? -h --help;? Replaced -help by --help. Adding more not >> straight forward. >> jps????????? -? -h --help;? added -? --help. -help accepted but no >> more documented. >> jshell?????? -? -h --help;? added -? >> jstat??????? -? -h --help;? added -h --help. -help accepted but no >> more documented. >> >> Best regards, >> ?? Goetz. > From vladimir.kozlov at oracle.com Wed Oct 11 23:03:36 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 11 Oct 2017 16:03:36 -0700 Subject: RFR(M): 8189102: All tools should support -?, -h and --help In-Reply-To: References: Message-ID: You missed AOT tool jaotc: http://hg.openjdk.java.net/jdk10/hs/file/44117bc2bedf/src/jdk.aot/share/classes/jdk.tools.jaotc/src/jdk/tools/jaotc/Options.java#l230 }, new Option(" --help Print this usage message", false, "--help", "-h", "-?") { Vladimir On 10/11/17 1:06 PM, Lindenmaier, Goetz wrote: > Hi > > The tools in jdk should all show the same behavior wrt. help flags. > This change normalizes the help flags of a row of the tools in the jdk. > Java accepts -?, -h and --help, thus I changed the tools to support > these, too. Some tools exited with '1' after displaying the help message, > I turned this to '0'. > > Maybe this is not the right mailing list for this, please advise. > > Please review this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/wr17/8189102-helpMessage/webrev.01/ > > In detail, this fixes the help message of the following tools: > jar -? -h --help; added -?. > jarsigner -? -h --help; added --help. -help accepted but not documented. > javac -? --help; added -?. Removed -help. -h is taken for other purpose > javadoc -? -h --help; added -h -?. Removed -help > javap -? -h --help; added -h. -help accepted but no more documented. > jcmd -? -h --help; added -? --help. -help accepted but no more documented. Changed return value to '0' > jdb -? -h --help; added -? -h --help. -help accepted but no more documented. > jdeprscan -? -h --help; added -? > jinfo -? -h --help; added -? --help. -help accepted but no more documented. > jjs -h --help; Replaced -help by --help. Adding more not straight forward. > jps -? 
-h --help; added -? --help. -help accepted but no more documented. > jshell -? -h --help; added -? > jstat -? -h --help; added -h --help. -help accepted but no more documented. > > Best regards, > Goetz. > From david.holmes at oracle.com Thu Oct 12 01:04:50 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Oct 2017 11:04:50 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> Message-ID: <799205ae-ba9f-ce3a-8dd6-1a55e32689df@oracle.com> Hi Bob, On 12/10/2017 5:11 AM, Bob Vandette wrote: > Here?s an updated webrev for this RFE that contains changes and cleanups > based on feedback I?ve received so far. > > I?m still investigating the best approach for reacting to cpu shares and > quotas. ?I do not believe doing nothing is the answer. I do. :) Let me try this again. When you run outside of a container you don't get 100% of the CPUs - you have to share with whatever else is running on the system. You get a fraction of CPU time based on the load. We don't try to communicate load information to the VM/application so it can adapt. Within a container setting shares/quotas is just a way of setting an artificial load. So why should we be treating it any differently? That's not to say an API to provide load/shares/quota information may not be useful, but that is a separate issue to what the "active processor count" should report. > > http://cr.openjdk.java.net/~bobv/8146115/webrev.01 > > Updates: > > 1. I had to move the processing of AggressiveHeap since the container > memory size needs to be known before this can be processed. I don't like the placement of this - we don't call os:: init functions from inside Arguments - we manage the initialization sequence from Threads::create_vm. Seems to me that container initialization can/should happen in os::init_before_ergo, and the AggressiveHeap processing can occur at the start of Arguments::apply_ergo(). That said we need to be sure nothing touched by set_aggressive_heap_flags will be used before we now reach that code - there are a lot of flags being set in there. > > 2. I no longer use the cpuset.cpus contents since sched_getaffinity > reports the correct results > even if someone manually updates the cgroup data. ?I originally didn?t > think this was the case since > sched_setaffinity didn?t automatically update the cpuset file contents > but the inverse is true. Ok. > > 3. I ifdef?d the container function support in > src/hotspot/share/runtime/os.hpp to avoid putting stubs in all other os > platform directories. ?I can do this if it?s absolutely necessary. You should not need to do this if initialization moves as I suggested above. os::init_before_ergo() in os_linux.cpp can call OSContainer::init(). No need for os::initialize_container_support() or os::pd_initialize_container_support. Some further comments: src/hotspot/share/runtime/globals.hpp + "Optimize heap optnios Typo. + product(intx, ActiveProcessorCount, -1, Why intx? 
It can be int then the logging log_trace(os)("active_processor_count: " "active processor count set by user : %d", (int)ActiveProcessorCount); can use %d without casts. Or you can init to 0 and make it uint (and use %u). + product(bool, UseContainerSupport, true, \ + "(Linux Only) Sorry don't recall if we already covered this, but this should be in ./os/linux/globals_linux.hpp --- src/hotspot/os/linux/os_linux.cpp/.hpp 187 log_trace(os)("available container memory: " JULONG_FORMAT, avail_mem); 188 return avail_mem; 189 } else { 190 log_debug(os,container)("container memory usage call failed: " JLONG_FORMAT, mem_usage); Why "trace" (the third logging level) to show the information, but "debug" (the second level) to show failed calls? You use debug in other files for basic info. Overall I'm unclear on your use of debug versus trace for the logging. --- src/hotspot/os/linux/osContainer_linux.cpp Dead code: 376 #if 0 377 os::Linux::print_container_info(tty); ... 390 #endif Thanks, David > Bob. From david.holmes at oracle.com Thu Oct 12 07:23:24 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Oct 2017 17:23:24 +1000 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> Message-ID: <4fb119b8-cb0b-474c-ebbc-60841ef4aa46@oracle.com> Hi Coleen, Thanks for doing this tedious cleanup! It was good to see so many casts disappear; and sad to see so many have to now appear in the sync code. :( There were a few things that struck me ... Atomic::xchg_ptr turned into Atomic::xchg; yet for the stub generator routines atomic_xchg_ptr became atomic_xchg_long - but I can't see where that stub will now come into play? --- src/hotspot/share/gc/shared/taskqueue.inline.hpp + return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, + (volatile intptr_t *)&_data, + (intptr_t)old_age._data); The actual types here should be size_t, can we now change it to use the real type? --- src/hotspot/share/oops/cpCache.cpp 114 bool ConstantPoolCacheEntry::init_flags_atomic(intptr_t flags) { 115 intptr_t result = Atomic::cmpxchg(flags, &_flags, (intptr_t)0); 116 return (result == 0); 117 } _flags is actually intx, yet above we treat it as intptr_t. But then later: 156 if (_flags == 0) { 157 intx newflags = (value & parameter_size_mask); 158 Atomic::cmpxchg(newflags, &_flags, (intx)0); 159 } its intx again. This looks really odd to me. --- src/hotspot/share/runtime/objectMonitor.inline.hpp The addition of header_addr() made me a little nervous :) Can we add a sanity assert either inside it (or in synchronizer.cpp), to verify that this == &_header (or monitor == monitor->header_addr()) --- src/hotspot/share/runtime/synchronizer.cpp // global list of blocks of monitors -// gBlockList is really PaddedEnd *, but we don't -// want to expose the PaddedEnd template more than necessary. -ObjectMonitor * volatile ObjectSynchronizer::gBlockList = NULL; +PaddedEnd * volatile ObjectSynchronizer::gBlockList = NULL; Did this have to change? I'm not sure why we didn't want to expose PaddedEnd, but it is now being exposed. Thanks, David ----- On 11/10/2017 11:50 PM, coleen.phillimore at oracle.com wrote: > > Please review version .02 which removes use of replace_if_null, but not > the function.? A separate RFE can be filed to discuss that. 
> > open webrev at http://cr.openjdk.java.net/~coleenp/8188220.02/webrev > > Thanks, > Coleen > > On 10/11/17 7:07 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>> >>>> Removing the operation is a different argument to renaming it. Most >>>> of the above argues for removing it. :) >>> >>> +1 on removing >> >> Thank you for all your feedback.? Erik best described what I was >> thinking.? I will remove it then.? There were not that many instances >> and one instance that people thought would be useful, needed the old >> return value. >> >> Coleen >>> >>> Thanks, Robbin >>> >>>> >>>> Cheers, >>>> David >>>> ----- >>>> >>>>> I have not reviewed this completely yet - thought I'd wait with >>>>> that until we agree about replace_if_null, if that is okay. >>>>> >>>>> Thanks, >>>>> /Erik >>>>> >>>>> On 2017-10-11 05:55, David Holmes wrote: >>>>>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> Summary: With the new template functions these are unnecessary. >>>>>>>> >>>>>>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. I >>>>>>>> disliked the first name because it's not explicit from the >>>>>>>> callers that there's an underlying cas.? If people want to >>>>>>>> fight, I'll remove the function and use cmpxchg because there >>>>>>>> are only a couple places where this is a little nicer. >>>>>>> >>>>>>> I'm still looking at other parts, but I want to respond to this now. >>>>>>> >>>>>>> I object to this change.? I think the proposed new name is >>>>>>> confusing, >>>>>>> suggesting there are two different comparisons involved. >>>>>>> >>>>>>> I originally called it something else that I wasn't entirely happy >>>>>>> with.? When David suggested replace_if_null I quickly adopted >>>>>>> that as >>>>>>> I think that name exactly describes what it does.? In particular, I >>>>>>> think "atomic replace if" pretty clearly suggests a test-and-set / >>>>>>> compare-and-swap type of operation. >>>>>> >>>>>> I totally agree. It's an Atomic operation, the implementation will >>>>>> involve something atomic, it doesn't matter if it is cmpxchg or >>>>>> something else. The name replace_if_null describes exactly what >>>>>> the function does - it doesn't have to describe how it does it. >>>>>> >>>>>> David >>>>>> ----- >>>>>> >>>>>>> Further, I think any name involving "cmpxchg" is problematic because >>>>>>> the result of this operation is intentionally different from >>>>>>> cmpxchg, >>>>>>> in order to better support the primary use-case, which is lazy >>>>>>> initialization. >>>>>>> >>>>>>> I also object to your alternative suggestion of removing the >>>>>>> operation >>>>>>> entirely and just using cmpxchg directly instead.? I don't recall >>>>>>> how >>>>>>> many occurrences there presently are, but I suspect more could >>>>>>> easily >>>>>>> be added; it's part of a lazy initialization pattern similar to DCLP >>>>>>> but without the locks. 
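(For illustration only: a minimal standalone sketch of the lazy-initialization pattern referred to above, i.e. "DCLP but without the locks". It is written against std::atomic rather than HotSpot's Atomic class, so every name below is a placeholder and not the API under review; the point it tries to show is that a caller of such a helper mostly cares whether its install won the race, not about the raw cmpxchg return value.)

// Lazy initialization via "install only if still null".
#include <atomic>

struct Cache { /* expensive-to-build state */ };

static std::atomic<Cache*> g_cache{nullptr};

Cache* get_cache() {
  Cache* c = g_cache.load(std::memory_order_acquire);
  if (c != nullptr) {
    return c;                        // already published by someone
  }
  Cache* fresh = new Cache();        // build a candidate
  Cache* expected = nullptr;
  // "replace if null": publish our candidate only if nobody beat us to it.
  if (g_cache.compare_exchange_strong(expected, fresh,
                                      std::memory_order_release,
                                      std::memory_order_acquire)) {
    return fresh;                    // we won the race
  }
  delete fresh;                      // another thread published first
  return expected;                   // use the winner's object
}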
>>>>>>> >>>>> >> > From kim.barrett at oracle.com Thu Oct 12 07:29:20 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 12 Oct 2017 03:29:20 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> Message-ID: > On Oct 11, 2017, at 7:07 AM, coleen.phillimore at oracle.com wrote: > > > > On 10/11/17 4:12 AM, Robbin Ehn wrote: >> On 10/11/2017 10:09 AM, David Holmes wrote: >>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>> >>> Removing the operation is a different argument to renaming it. Most of the above argues for removing it. :) >> >> +1 on removing > > Thank you for all your feedback. Erik best described what I was thinking. I will remove it then. There were not that many instances and one instance that people thought would be useful, needed the old return value. I?ve already registered my objection to removal. I disagree with several of Erik?s points, which don?t address or miss the issues brought up in the original discussion that led to its introduction, as quoted by David. I?m still slogging my way through the review, maybe about 3/4 of the way through. I?ve found a number of real problems, some pre-existing and discovered by looking at the code around your changes; I think there are a couple of ABA bugs, for example. I?m worried that I?m missing some too, because I?m getting burned out from reading reams of lock-free code. This is *really* hard, and I very much wish it had been broken up into more easily digestible chunks. From coleen.phillimore at oracle.com Thu Oct 12 11:29:02 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 12 Oct 2017 07:29:02 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> Message-ID: <4b56a92a-474e-1aa8-f217-413e4f642b6d@oracle.com> On 10/12/17 3:29 AM, Kim Barrett wrote: >> On Oct 11, 2017, at 7:07 AM, coleen.phillimore at oracle.com wrote: >> >> >> >> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>> >>>> Removing the operation is a different argument to renaming it. Most of the above argues for removing it. :) >>> +1 on removing >> Thank you for all your feedback. Erik best described what I was thinking. I will remove it then. There were not that many instances and one instance that people thought would be useful, needed the old return value. > I?ve already registered my objection to removal. I disagree with several of Erik?s points, which don?t > address or miss the issues brought up in the original discussion that led to its introduction, as quoted > by David. You can file an RFE for it. > > I?m still slogging my way through the review, maybe about 3/4 of the way through. > > I?ve found a number of real problems, some pre-existing and discovered by looking at the code > around your changes; I think there are a couple of ABA bugs, for example. I?m worried that I?m > missing some too, because I?m getting burned out from reading reams of lock-free code. 
This > is *really* hard, and I very much wish it had been broken up into more easily digestible chunks. > > Were these bugs pre-existing or did I introduce them? Thanks, Coleen From david.holmes at oracle.com Thu Oct 12 11:35:50 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Oct 2017 21:35:50 +1000 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> Message-ID: <0740f4ce-9388-d225-75d3-47f4657dcac3@oracle.com> On 12/10/2017 5:29 PM, Kim Barrett wrote: >> On Oct 11, 2017, at 7:07 AM, coleen.phillimore at oracle.com wrote: >> >> >> >> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>> >>>> Removing the operation is a different argument to renaming it. Most of the above argues for removing it. :) >>> >>> +1 on removing >> >> Thank you for all your feedback. Erik best described what I was thinking. I will remove it then. There were not that many instances and one instance that people thought would be useful, needed the old return value. > > I?ve already registered my objection to removal. I disagree with several of Erik?s points, which don?t > address or miss the issues brought up in the original discussion that led to its introduction, as quoted > by David. > > I?m still slogging my way through the review, maybe about 3/4 of the way through. > > I?ve found a number of real problems, some pre-existing and discovered by looking at the code > around your changes; I think there are a couple of ABA bugs, for example. I?m worried that I?m > missing some too, because I?m getting burned out from reading reams of lock-free code. This > is *really* hard, and I very much wish it had been broken up into more easily digestible chunks. I can't see how Coleen's changes can have introduced any bugs like that. So if there are ABA or other issues, then I think we would deal with them separately. Cheers, David > From coleen.phillimore at oracle.com Thu Oct 12 11:52:43 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 12 Oct 2017 07:52:43 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <4fb119b8-cb0b-474c-ebbc-60841ef4aa46@oracle.com> References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> <4fb119b8-cb0b-474c-ebbc-60841ef4aa46@oracle.com> Message-ID: <5986a9d6-a27f-8462-d13e-5e11de8e358c@oracle.com> On 10/12/17 3:23 AM, David Holmes wrote: > Hi Coleen, > > Thanks for doing this tedious cleanup! > > It was good to see so many casts disappear; and sad to see so many > have to now appear in the sync code. :( The sync code has _owner field as void* because it can be several things.? I didn't try to > > There were a few things that struck me ... > > Atomic::xchg_ptr turned into Atomic::xchg; yet for the stub generator > routines atomic_xchg_ptr became atomic_xchg_long - but I can't see > where that stub will now come into play? http://cr.openjdk.java.net/~coleenp/8188220.02/webrev/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp.udiff.html I tried to remove it but windows x64 uses a stub for xchg (and others).? 
There was a preexisting stub for cmpxchg_long which I followed naming convention. ? static address _atomic_cmpxchg_entry; ? static address _atomic_cmpxchg_byte_entry; ? static address _atomic_cmpxchg_long_entry; Technically I think it should be long_long, as well as the cmpxchg_long_entry as well. I also missed renaming store_ptr_entry and add_ptr_entry.? What do you suggest? > > --- > > src/hotspot/share/gc/shared/taskqueue.inline.hpp > > +? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, > +????????????????????????????????? (volatile intptr_t *)&_data, > +????????????????????????????????? (intptr_t)old_age._data); > > The actual types here should be size_t, can we now change it to use > the real type? Yes, fixed.? Missed that one. > > --- > > src/hotspot/share/oops/cpCache.cpp > > ?114 bool ConstantPoolCacheEntry::init_flags_atomic(intptr_t flags) { > ?115?? intptr_t result = Atomic::cmpxchg(flags, &_flags, (intptr_t)0); > ?116?? return (result == 0); > ?117 } > > _flags is actually intx, yet above we treat it as intptr_t. But then > later: > > ?156?? if (_flags == 0) { > ?157???? intx newflags = (value & parameter_size_mask); > ?158???? Atomic::cmpxchg(newflags, &_flags, (intx)0); > ?159?? } > > its intx again. This looks really odd to me. It's better as an intx, because that's what it's declared as.?? I'll patch up some other uses but don't promise total consistency because I don't want to pull on this particular sweater thread too much. intx and intptr_t I believe are typedefed to each other. typedef intptr_t? intx; Should we not have intx and uintx and change all their uses??? I've sworn off large changes after this though. ConstantPoolCacheEntry::make_flags returns an int.?? I fixed init_flags_atomic() because it's declared with an intx and defined with intptr_t. > > --- > > src/hotspot/share/runtime/objectMonitor.inline.hpp > > The addition of header_addr() made me a little nervous :) Can we add a > sanity assert either inside it (or in synchronizer.cpp), to verify > that this == &_header? (or monitor == monitor->header_addr()) Where I introduced it, looked like undefined behavior because it assumed that the header was the first field. So I should sanity check that other places with undefined behavior won't break?? Sure I'll do that. > > --- > > src/hotspot/share/runtime/synchronizer.cpp > > ?// global list of blocks of monitors > -// gBlockList is really PaddedEnd *, but we don't > -// want to expose the PaddedEnd template more than necessary. > -ObjectMonitor * volatile ObjectSynchronizer::gBlockList = NULL; > +PaddedEnd * volatile ObjectSynchronizer::gBlockList = > NULL; > > Did this have to change? I'm not sure why we didn't want to expose > PaddedEnd, but it is now being exposed. I didn't see why not and it avoided a bunch of ugly casts.?? I tested that the SA was fine with it because the SA manually did the address adjustment.? The SA could be fixed to know about PaddedEnd if it's somehting they want to do. Thanks for going through and reviewing all of this.?? Please answer question about the stub function name and I'll include the change with this patch. Coleen > > Thanks, > David > ----- > > > On 11/10/2017 11:50 PM, coleen.phillimore at oracle.com wrote: >> >> Please review version .02 which removes use of replace_if_null, but >> not the function.? A separate RFE can be filed to discuss that. 
>> >> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.02/webrev >> >> Thanks, >> Coleen >> >> On 10/11/17 7:07 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>>> >>>>> Removing the operation is a different argument to renaming it. >>>>> Most of the above argues for removing it. :) >>>> >>>> +1 on removing >>> >>> Thank you for all your feedback.? Erik best described what I was >>> thinking.? I will remove it then.? There were not that many >>> instances and one instance that people thought would be useful, >>> needed the old return value. >>> >>> Coleen >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Cheers, >>>>> David >>>>> ----- >>>>> >>>>>> I have not reviewed this completely yet - thought I'd wait with >>>>>> that until we agree about replace_if_null, if that is okay. >>>>>> >>>>>> Thanks, >>>>>> /Erik >>>>>> >>>>>> On 2017-10-11 05:55, David Holmes wrote: >>>>>>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>> >>>>>>>>> Summary: With the new template functions these are unnecessary. >>>>>>>>> >>>>>>>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. >>>>>>>>> I disliked the first name because it's not explicit from the >>>>>>>>> callers that there's an underlying cas.? If people want to >>>>>>>>> fight, I'll remove the function and use cmpxchg because there >>>>>>>>> are only a couple places where this is a little nicer. >>>>>>>> >>>>>>>> I'm still looking at other parts, but I want to respond to this >>>>>>>> now. >>>>>>>> >>>>>>>> I object to this change.? I think the proposed new name is >>>>>>>> confusing, >>>>>>>> suggesting there are two different comparisons involved. >>>>>>>> >>>>>>>> I originally called it something else that I wasn't entirely happy >>>>>>>> with.? When David suggested replace_if_null I quickly adopted >>>>>>>> that as >>>>>>>> I think that name exactly describes what it does. In particular, I >>>>>>>> think "atomic replace if" pretty clearly suggests a test-and-set / >>>>>>>> compare-and-swap type of operation. >>>>>>> >>>>>>> I totally agree. It's an Atomic operation, the implementation >>>>>>> will involve something atomic, it doesn't matter if it is >>>>>>> cmpxchg or something else. The name replace_if_null describes >>>>>>> exactly what the function does - it doesn't have to describe how >>>>>>> it does it. >>>>>>> >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> Further, I think any name involving "cmpxchg" is problematic >>>>>>>> because >>>>>>>> the result of this operation is intentionally different from >>>>>>>> cmpxchg, >>>>>>>> in order to better support the primary use-case, which is lazy >>>>>>>> initialization. >>>>>>>> >>>>>>>> I also object to your alternative suggestion of removing the >>>>>>>> operation >>>>>>>> entirely and just using cmpxchg directly instead.? I don't >>>>>>>> recall how >>>>>>>> many occurrences there presently are, but I suspect more could >>>>>>>> easily >>>>>>>> be added; it's part of a lazy initialization pattern similar to >>>>>>>> DCLP >>>>>>>> but without the locks. 
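(For illustration only: a standalone sketch of the kind of sanity assert being requested for header_addr() above. The class, typedef and field names here are stand-ins so the snippet compiles on its own; they are not the actual ObjectMonitor code from the webrev.)

#include <cassert>
#include <cstdint>

// Stand-in for the mark word type; in HotSpot this would be markOop.
typedef intptr_t markword_t;

class Monitor {
  volatile markword_t _header;   // must stay the first field for this to hold
  void*               _owner;
 public:
  volatile markword_t* header_addr() {
    // The sanity check under discussion: header_addr() is only equivalent to
    // a cast of the monitor address if _header really sits at offset 0.
    assert((void*)this == (void*)&_header &&
           "_header is expected to be the first field");
    return &_header;
  }
};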
>>>>>>>> >>>>>> >>> >> From coleen.phillimore at oracle.com Thu Oct 12 11:54:09 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 12 Oct 2017 07:54:09 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> Message-ID: <53384d01-094e-19cd-b0a7-d695386e8d4d@oracle.com> Kim, I have this change as an mq patchset.? If you teach me more mq commands, I'll post webrevs for each. :) thanks, Coleen On 10/12/17 3:29 AM, Kim Barrett wrote: >> On Oct 11, 2017, at 7:07 AM, coleen.phillimore at oracle.com wrote: >> >> >> >> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>> >>>> Removing the operation is a different argument to renaming it. Most of the above argues for removing it. :) >>> +1 on removing >> Thank you for all your feedback. Erik best described what I was thinking. I will remove it then. There were not that many instances and one instance that people thought would be useful, needed the old return value. > I?ve already registered my objection to removal. I disagree with several of Erik?s points, which don?t > address or miss the issues brought up in the original discussion that led to its introduction, as quoted > by David. > > I?m still slogging my way through the review, maybe about 3/4 of the way through. > > I?ve found a number of real problems, some pre-existing and discovered by looking at the code > around your changes; I think there are a couple of ABA bugs, for example. I?m worried that I?m > missing some too, because I?m getting burned out from reading reams of lock-free code. This > is *really* hard, and I very much wish it had been broken up into more easily digestible chunks. > > From david.holmes at oracle.com Thu Oct 12 12:21:36 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Oct 2017 22:21:36 +1000 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <5986a9d6-a27f-8462-d13e-5e11de8e358c@oracle.com> References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> <4fb119b8-cb0b-474c-ebbc-60841ef4aa46@oracle.com> <5986a9d6-a27f-8462-d13e-5e11de8e358c@oracle.com> Message-ID: <354409e5-8985-6710-1a1a-848a6b366d12@oracle.com> On 12/10/2017 9:52 PM, coleen.phillimore at oracle.com wrote: > On 10/12/17 3:23 AM, David Holmes wrote: >> Hi Coleen, >> >> Thanks for doing this tedious cleanup! >> >> It was good to see so many casts disappear; and sad to see so many >> have to now appear in the sync code. :( > > The sync code has _owner field as void* because it can be several > things.? I didn't try to Yeah I understood why this had to happen. >> >> There were a few things that struck me ... >> >> Atomic::xchg_ptr turned into Atomic::xchg; yet for the stub generator >> routines atomic_xchg_ptr became atomic_xchg_long - but I can't see >> where that stub will now come into play? > > http://cr.openjdk.java.net/~coleenp/8188220.02/webrev/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp.udiff.html > > > I tried to remove it but windows x64 uses a stub for xchg (and others). 
Ah so I think this is where it is used: ./os_cpu/windows_x86/atomic_windows_x86.hpp:DEFINE_STUB_XCHG(8, jlong, os::atomic_xchg_ptr_func) ie atomic_xchg_ptr is the stub for Atomic::xchg<8> > There was a preexisting stub for cmpxchg_long which I followed naming > convention. > > ? static address _atomic_cmpxchg_entry; > ? static address _atomic_cmpxchg_byte_entry; > ? static address _atomic_cmpxchg_long_entry; > > Technically I think it should be long_long, as well as the > cmpxchg_long_entry as well. Or int64_t > I also missed renaming store_ptr_entry and add_ptr_entry.? What do you > suggest? store_ptr_entry actually seems unused. add_ptr_entry looks like it needs to be the 64-bit Atomic::add<8> implementation - so probably add_int64_t_entry. >> >> --- >> >> src/hotspot/share/gc/shared/taskqueue.inline.hpp >> >> +? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >> +????????????????????????????????? (volatile intptr_t *)&_data, >> +????????????????????????????????? (intptr_t)old_age._data); >> >> The actual types here should be size_t, can we now change it to use >> the real type? > > Yes, fixed.? Missed that one. >> >> --- >> >> src/hotspot/share/oops/cpCache.cpp >> >> ?114 bool ConstantPoolCacheEntry::init_flags_atomic(intptr_t flags) { >> ?115?? intptr_t result = Atomic::cmpxchg(flags, &_flags, (intptr_t)0); >> ?116?? return (result == 0); >> ?117 } >> >> _flags is actually intx, yet above we treat it as intptr_t. But then >> later: >> >> ?156?? if (_flags == 0) { >> ?157???? intx newflags = (value & parameter_size_mask); >> ?158???? Atomic::cmpxchg(newflags, &_flags, (intx)0); >> ?159?? } >> >> its intx again. This looks really odd to me. > > It's better as an intx, because that's what it's declared as.?? I'll > patch up some other uses but don't promise total consistency because I > don't want to pull on this particular sweater thread too much. intx and > intptr_t I believe are typedefed to each other. > > typedef intptr_t? intx; > > Should we not have intx and uintx and change all their uses??? I've > sworn off large changes after this though. I don't know why we have intx/uintx other than someone not liking having to type intptr_t all the time. > ConstantPoolCacheEntry::make_flags returns an int.?? I fixed > init_flags_atomic() because it's declared with an intx and defined with > intptr_t. Ok. >> >> --- >> >> src/hotspot/share/runtime/objectMonitor.inline.hpp >> >> The addition of header_addr() made me a little nervous :) Can we add a >> sanity assert either inside it (or in synchronizer.cpp), to verify >> that this == &_header? (or monitor == monitor->header_addr()) > > Where I introduced it, looked like undefined behavior because it assumed > that the header was the first field. Assumes and expects, I think. Not sure if it is undefined behaviour or not. > So I should sanity check that other places with undefined behavior won't > break?? Sure I'll do that. No only sanity check that your change actually didn't change anything. :) >> >> --- >> >> src/hotspot/share/runtime/synchronizer.cpp >> >> ?// global list of blocks of monitors >> -// gBlockList is really PaddedEnd *, but we don't >> -// want to expose the PaddedEnd template more than necessary. >> -ObjectMonitor * volatile ObjectSynchronizer::gBlockList = NULL; >> +PaddedEnd * volatile ObjectSynchronizer::gBlockList = >> NULL; >> >> Did this have to change? I'm not sure why we didn't want to expose >> PaddedEnd, but it is now being exposed. > > I didn't see why not and it avoided a bunch of ugly casts.?? 
I tested > that the SA was fine with it because the SA manually did the address > adjustment.? The SA could be fixed to know about PaddedEnd if it's > somehting they want to do. Glad you mentioned SA as I forgot to mention that with the vmStructs changes. :) > Thanks for going through and reviewing all of this.?? Please answer > question about the stub function name and I'll include the change with > this patch. Would like to see an incremental webrev please. (Should be easy if you're using mq :) ) Thanks, David > Coleen > >> >> Thanks, >> David >> ----- >> >> >> On 11/10/2017 11:50 PM, coleen.phillimore at oracle.com wrote: >>> >>> Please review version .02 which removes use of replace_if_null, but >>> not the function.? A separate RFE can be filed to discuss that. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.02/webrev >>> >>> Thanks, >>> Coleen >>> >>> On 10/11/17 7:07 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>>>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>>>> >>>>>> Removing the operation is a different argument to renaming it. >>>>>> Most of the above argues for removing it. :) >>>>> >>>>> +1 on removing >>>> >>>> Thank you for all your feedback.? Erik best described what I was >>>> thinking.? I will remove it then.? There were not that many >>>> instances and one instance that people thought would be useful, >>>> needed the old return value. >>>> >>>> Coleen >>>>> >>>>> Thanks, Robbin >>>>> >>>>>> >>>>>> Cheers, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> I have not reviewed this completely yet - thought I'd wait with >>>>>>> that until we agree about replace_if_null, if that is okay. >>>>>>> >>>>>>> Thanks, >>>>>>> /Erik >>>>>>> >>>>>>> On 2017-10-11 05:55, David Holmes wrote: >>>>>>>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>>>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> >>>>>>>>>> Summary: With the new template functions these are unnecessary. >>>>>>>>>> >>>>>>>>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. >>>>>>>>>> I disliked the first name because it's not explicit from the >>>>>>>>>> callers that there's an underlying cas.? If people want to >>>>>>>>>> fight, I'll remove the function and use cmpxchg because there >>>>>>>>>> are only a couple places where this is a little nicer. >>>>>>>>> >>>>>>>>> I'm still looking at other parts, but I want to respond to this >>>>>>>>> now. >>>>>>>>> >>>>>>>>> I object to this change.? I think the proposed new name is >>>>>>>>> confusing, >>>>>>>>> suggesting there are two different comparisons involved. >>>>>>>>> >>>>>>>>> I originally called it something else that I wasn't entirely happy >>>>>>>>> with.? When David suggested replace_if_null I quickly adopted >>>>>>>>> that as >>>>>>>>> I think that name exactly describes what it does. In particular, I >>>>>>>>> think "atomic replace if" pretty clearly suggests a test-and-set / >>>>>>>>> compare-and-swap type of operation. >>>>>>>> >>>>>>>> I totally agree. It's an Atomic operation, the implementation >>>>>>>> will involve something atomic, it doesn't matter if it is >>>>>>>> cmpxchg or something else. The name replace_if_null describes >>>>>>>> exactly what the function does - it doesn't have to describe how >>>>>>>> it does it. 
>>>>>>>> >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>>> Further, I think any name involving "cmpxchg" is problematic >>>>>>>>> because >>>>>>>>> the result of this operation is intentionally different from >>>>>>>>> cmpxchg, >>>>>>>>> in order to better support the primary use-case, which is lazy >>>>>>>>> initialization. >>>>>>>>> >>>>>>>>> I also object to your alternative suggestion of removing the >>>>>>>>> operation >>>>>>>>> entirely and just using cmpxchg directly instead.? I don't >>>>>>>>> recall how >>>>>>>>> many occurrences there presently are, but I suspect more could >>>>>>>>> easily >>>>>>>>> be added; it's part of a lazy initialization pattern similar to >>>>>>>>> DCLP >>>>>>>>> but without the locks. >>>>>>>>> >>>>>>> >>>> >>> > From coleen.phillimore at oracle.com Thu Oct 12 12:55:56 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 12 Oct 2017 08:55:56 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <354409e5-8985-6710-1a1a-848a6b366d12@oracle.com> References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> <4fb119b8-cb0b-474c-ebbc-60841ef4aa46@oracle.com> <5986a9d6-a27f-8462-d13e-5e11de8e358c@oracle.com> <354409e5-8985-6710-1a1a-848a6b366d12@oracle.com> Message-ID: <1a37a25f-8a72-3990-4849-24dbfbc21b0a@oracle.com> On 10/12/17 8:21 AM, David Holmes wrote: > On 12/10/2017 9:52 PM, coleen.phillimore at oracle.com wrote: >> On 10/12/17 3:23 AM, David Holmes wrote: >>> Hi Coleen, >>> >>> Thanks for doing this tedious cleanup! >>> >>> It was good to see so many casts disappear; and sad to see so many >>> have to now appear in the sync code. :( >> >> The sync code has _owner field as void* because it can be several >> things.? I didn't try to > > Yeah I understood why this had to happen. > >>> >>> There were a few things that struck me ... >>> >>> Atomic::xchg_ptr turned into Atomic::xchg; yet for the stub >>> generator routines atomic_xchg_ptr became atomic_xchg_long - but I >>> can't see where that stub will now come into play? >> >> http://cr.openjdk.java.net/~coleenp/8188220.02/webrev/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp.udiff.html >> >> >> I tried to remove it but windows x64 uses a stub for xchg (and others). > > Ah so I think this is where it is used: > > ./os_cpu/windows_x86/atomic_windows_x86.hpp:DEFINE_STUB_XCHG(8, jlong, > os::atomic_xchg_ptr_func) > > ie atomic_xchg_ptr is the stub for Atomic::xchg<8> > >> There was a preexisting stub for cmpxchg_long which I followed naming >> convention. >> >> ?? static address _atomic_cmpxchg_entry; >> ?? static address _atomic_cmpxchg_byte_entry; >> ?? static address _atomic_cmpxchg_long_entry; >> >> Technically I think it should be long_long, as well as the >> cmpxchg_long_entry as well. > > Or int64_t > >> I also missed renaming store_ptr_entry and add_ptr_entry.? What do >> you suggest? > > store_ptr_entry actually seems unused. > > add_ptr_entry looks like it needs to be the 64-bit Atomic::add<8> > implementation - so probably add_int64_t_entry. https://bugs.openjdk.java.net/browse/JDK-8186903 I'm renaming to ptr => long for now to follow other code and fixing the name with this RFE to what it really is, and what we decide. It was pretty ugly as: ? static jint????? (*atomic_add_func)?????????? (jint,????? volatile jint*); ? static intptr_t? (*atomic_add_ptr_func)?????? (intptr_t,? 
volatile intptr_t*); When the other uses jint as an argument.?? Actually, I think add_ptr makes more sense in this context than long.? I think I should leave this name and not make it long. > >>> >>> --- >>> >>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>> >>> +? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>> +????????????????????????????????? (volatile intptr_t *)&_data, >>> +????????????????????????????????? (intptr_t)old_age._data); >>> >>> The actual types here should be size_t, can we now change it to use >>> the real type? >> >> Yes, fixed.? Missed that one. >>> >>> --- >>> >>> src/hotspot/share/oops/cpCache.cpp >>> >>> ?114 bool ConstantPoolCacheEntry::init_flags_atomic(intptr_t flags) { >>> ?115?? intptr_t result = Atomic::cmpxchg(flags, &_flags, (intptr_t)0); >>> ?116?? return (result == 0); >>> ?117 } >>> >>> _flags is actually intx, yet above we treat it as intptr_t. But then >>> later: >>> >>> ?156?? if (_flags == 0) { >>> ?157???? intx newflags = (value & parameter_size_mask); >>> ?158???? Atomic::cmpxchg(newflags, &_flags, (intx)0); >>> ?159?? } >>> >>> its intx again. This looks really odd to me. >> >> It's better as an intx, because that's what it's declared as. I'll >> patch up some other uses but don't promise total consistency because >> I don't want to pull on this particular sweater thread too much. intx >> and intptr_t I believe are typedefed to each other. >> >> typedef intptr_t? intx; >> >> Should we not have intx and uintx and change all their uses? I've >> sworn off large changes after this though. > > I don't know why we have intx/uintx other than someone not liking > having to type intptr_t all the time. > >> ConstantPoolCacheEntry::make_flags returns an int.?? I fixed >> init_flags_atomic() because it's declared with an intx and defined >> with intptr_t. > > Ok. > >>> >>> --- >>> >>> src/hotspot/share/runtime/objectMonitor.inline.hpp >>> >>> The addition of header_addr() made me a little nervous :) Can we add >>> a sanity assert either inside it (or in synchronizer.cpp), to verify >>> that this == &_header? (or monitor == monitor->header_addr()) >> >> Where I introduced it, looked like undefined behavior because it >> assumed that the header was the first field. > > Assumes and expects, I think. Not sure if it is undefined behaviour or > not. Assumes without giving the static compiler a chance to check that what you've done is correct or not.? Maybe that's not undefined behavior. > >> So I should sanity check that other places with undefined behavior >> won't break?? Sure I'll do that. > > No only sanity check that your change actually didn't change anything. :) As well. > >>> >>> --- >>> >>> src/hotspot/share/runtime/synchronizer.cpp >>> >>> ?// global list of blocks of monitors >>> -// gBlockList is really PaddedEnd *, but we don't >>> -// want to expose the PaddedEnd template more than necessary. >>> -ObjectMonitor * volatile ObjectSynchronizer::gBlockList = NULL; >>> +PaddedEnd * volatile ObjectSynchronizer::gBlockList >>> = NULL; >>> >>> Did this have to change? I'm not sure why we didn't want to expose >>> PaddedEnd, but it is now being exposed. >> >> I didn't see why not and it avoided a bunch of ugly casts.?? I tested >> that the SA was fine with it because the SA manually did the address >> adjustment.? The SA could be fixed to know about PaddedEnd if it's >> somehting they want to do. > > Glad you mentioned SA as I forgot to mention that with the vmStructs > changes. :) > >> Thanks for going through and reviewing all of this.?? 
Please answer >> question about the stub function name and I'll include the change >> with this patch. > > Would like to see an incremental webrev please. (Should be easy if > you're using mq :) ) Will do. Thanks, Coleen > > Thanks, > David > >> Coleen >> >>> >>> Thanks, >>> David >>> ----- >>> >>> >>> On 11/10/2017 11:50 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Please review version .02 which removes use of replace_if_null, but >>>> not the function.? A separate RFE can be filed to discuss that. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.02/webrev >>>> >>>> Thanks, >>>> Coleen >>>> >>>> On 10/11/17 7:07 AM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>>>>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>>>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>>>>> >>>>>>> Removing the operation is a different argument to renaming it. >>>>>>> Most of the above argues for removing it. :) >>>>>> >>>>>> +1 on removing >>>>> >>>>> Thank you for all your feedback.? Erik best described what I was >>>>> thinking.? I will remove it then.? There were not that many >>>>> instances and one instance that people thought would be useful, >>>>> needed the old return value. >>>>> >>>>> Coleen >>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>>> >>>>>>> Cheers, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> I have not reviewed this completely yet - thought I'd wait with >>>>>>>> that until we agree about replace_if_null, if that is okay. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> /Erik >>>>>>>> >>>>>>>> On 2017-10-11 05:55, David Holmes wrote: >>>>>>>>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>>>>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Summary: With the new template functions these are unnecessary. >>>>>>>>>>> >>>>>>>>>>> 2. renamed Atomic::replace_if_null to >>>>>>>>>>> Atomic::cmpxchg_if_null. I disliked the first name because >>>>>>>>>>> it's not explicit from the callers that there's an >>>>>>>>>>> underlying cas.? If people want to fight, I'll remove the >>>>>>>>>>> function and use cmpxchg because there are only a couple >>>>>>>>>>> places where this is a little nicer. >>>>>>>>>> >>>>>>>>>> I'm still looking at other parts, but I want to respond to >>>>>>>>>> this now. >>>>>>>>>> >>>>>>>>>> I object to this change.? I think the proposed new name is >>>>>>>>>> confusing, >>>>>>>>>> suggesting there are two different comparisons involved. >>>>>>>>>> >>>>>>>>>> I originally called it something else that I wasn't entirely >>>>>>>>>> happy >>>>>>>>>> with.? When David suggested replace_if_null I quickly adopted >>>>>>>>>> that as >>>>>>>>>> I think that name exactly describes what it does. In >>>>>>>>>> particular, I >>>>>>>>>> think "atomic replace if" pretty clearly suggests a >>>>>>>>>> test-and-set / >>>>>>>>>> compare-and-swap type of operation. >>>>>>>>> >>>>>>>>> I totally agree. It's an Atomic operation, the implementation >>>>>>>>> will involve something atomic, it doesn't matter if it is >>>>>>>>> cmpxchg or something else. The name replace_if_null describes >>>>>>>>> exactly what the function does - it doesn't have to describe >>>>>>>>> how it does it. 
>>>>>>>>> >>>>>>>>> David >>>>>>>>> ----- >>>>>>>>> >>>>>>>>>> Further, I think any name involving "cmpxchg" is problematic >>>>>>>>>> because >>>>>>>>>> the result of this operation is intentionally different from >>>>>>>>>> cmpxchg, >>>>>>>>>> in order to better support the primary use-case, which is lazy >>>>>>>>>> initialization. >>>>>>>>>> >>>>>>>>>> I also object to your alternative suggestion of removing the >>>>>>>>>> operation >>>>>>>>>> entirely and just using cmpxchg directly instead.? I don't >>>>>>>>>> recall how >>>>>>>>>> many occurrences there presently are, but I suspect more >>>>>>>>>> could easily >>>>>>>>>> be added; it's part of a lazy initialization pattern similar >>>>>>>>>> to DCLP >>>>>>>>>> but without the locks. >>>>>>>>>> >>>>>>>> >>>>> >>>> >> From bob.vandette at oracle.com Thu Oct 12 15:43:17 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 12 Oct 2017 11:43:17 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <799205ae-ba9f-ce3a-8dd6-1a55e32689df@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <7e9322e8-274d-9fb2-f6a5-8cc612e3fe68@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> <799205ae-ba9f-ce3a-8dd6-1a55e32689df@oracle.com> Message-ID: <9956F9D0-B01B-44FE-AE56-527907816436@oracle.com> > On Oct 11, 2017, at 9:04 PM, David Holmes wrote: > > Hi Bob, > > On 12/10/2017 5:11 AM, Bob Vandette wrote: >> Here?s an updated webrev for this RFE that contains changes and cleanups based on feedback I?ve received so far. >> I?m still investigating the best approach for reacting to cpu shares and quotas. I do not believe doing nothing is the answer. > > I do. :) Let me try this again. When you run outside of a container you don't get 100% of the CPUs - you have to share with whatever else is running on the system. You get a fraction of CPU time based on the load. We don't try to communicate load information to the VM/application so it can adapt. Within a container setting shares/quotas is just a way of setting an artificial load. So why should we be treating it any differently? Because today we optimize for a lightly loaded system and when running serverless applications in containers we should be optimizing for a fully loaded system. If developers don?t want this, then don?t use shares or quotas and you?ll have exactly the behavior you have today. I think we just have to document the new behavior (and how to turn it off) so people know what to expect. You seem to discount the added cost of 100s of VMs creating lots of un-necessaary threads. In the current JDK 10 code base, In a heavily loaded system with 88 processors, VmData grows from 60MBs (1 cpu) to 376MB (88 cpus). This is only mapped memory and it depends heavily on how deep in the stack these threads go before it impacts VmRSS but it shows the potential downside of having 100s of VMs thinking they each own the entire machine. I haven?t even done any experiments to determine the added context switching cost if the VM decides to use excessive pthreads. 
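(For illustration only: one possible way to turn a cgroup v1 cpu quota/period pair into a processor count, as context for the shares/quota debate above. The file paths, error-code convention and round-up policy are assumptions made for this sketch, not the actual JDK-8146115 implementation.)

#include <cstdio>
#include <algorithm>

// Read a single long from a cgroup file; -2 signals "could not read".
static long read_long(const char* path) {
  FILE* f = fopen(path, "r");
  if (f == NULL) return -2;
  long v = -2;
  if (fscanf(f, "%ld", &v) != 1) v = -2;
  fclose(f);
  return v;
}

// Derive a CPU count from quota/period; -1 means "no quota configured".
static int quota_cpu_count(int host_cpus) {
  long quota  = read_long("/sys/fs/cgroup/cpu/cpu.cfs_quota_us");
  long period = read_long("/sys/fs/cgroup/cpu/cpu.cfs_period_us");
  if (quota <= 0 || period <= 0) return -1;          // unlimited or unreadable
  int cpus = (int)((quota + period - 1) / period);   // round up
  return std::min(cpus, host_cpus);                  // never exceed the host
}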
> > That's not to say an API to provide load/shares/quota information may not be useful, but that is a separate issue to what the "active processor count" should report. I don?t have a problem with active processor count reporting the number of processors we have, but I do have a problem with our current usage of this information within the VM and Core libraries. > >> http://cr.openjdk.java.net/~bobv/8146115/webrev.01 >> Updates: >> 1. I had to move the processing of AggressiveHeap since the container memory size needs to be known before this can be processed. > > I don't like the placement of this - we don't call os:: init functions from inside Arguments - we manage the initialization sequence from Threads::create_vm. Seems to me that container initialization can/should happen in os::init_before_ergo, and the AggressiveHeap processing can occur at the start of Arguments::apply_ergo(). > > That said we need to be sure nothing touched by set_aggressive_heap_flags will be used before we now reach that code - there are a lot of flags being set in there. This is exactly the reason why I put the call where it did. I put the call to set_aggressive_heap_flags in finalize_vm_init_args because that is exactly what this call is doing. It?s finalizing flags used after the parsing. The impacted flags are definitely being used shortly after and before init_before_ergo is called. > >> 2. I no longer use the cpuset.cpus contents since sched_getaffinity reports the correct results >> even if someone manually updates the cgroup data. I originally didn?t think this was the case since >> sched_setaffinity didn?t automatically update the cpuset file contents but the inverse is true. > > Ok. > >> 3. I ifdef?d the container function support in src/hotspot/share/runtime/os.hpp to avoid putting stubs in all other os >> platform directories. I can do this if it?s absolutely necessary. > > You should not need to do this if initialization moves as I suggested above. os::init_before_ergo() in os_linux.cpp can call OSContainer::init(). > No need for os::initialize_container_support() or os::pd_initialize_container_support. But os::init_before_ergo is in shared code. > > > Some further comments: > > src/hotspot/share/runtime/globals.hpp > > + "Optimize heap optnios > > Typo. Thx. > > + product(intx, ActiveProcessorCount, -1, Cut and paste issue, fixed. > > Why intx? It can be int then the logging > > log_trace(os)("active_processor_count: " > "active processor count set by user : %d", > (int)ActiveProcessorCount); > > can use %d without casts. Or you can init to 0 and make it uint (and use %u). > > + product(bool, UseContainerSupport, true, \ > + "(Linux Only) > > Sorry don't recall if we already covered this, but this should be in ./os/linux/globals_linux.hpp Fixed. > > --- > > src/hotspot/os/linux/os_linux.cpp/.hpp > > 187 log_trace(os)("available container memory: " JULONG_FORMAT, avail_mem); > 188 return avail_mem; > 189 } else { > 190 log_debug(os,container)("container memory usage call failed: " JLONG_FORMAT, mem_usage); > > Why "trace" (the third logging level) to show the information, but "debug" (the second level) to show failed calls? You use debug in other files for basic info. Overall I'm unclear on your use of debug versus trace for the logging. I use trace for noisy information that is not reporting errors and debug for failures that are informational and not fatal. In this case, the call could return -1 or -2. -1 is unlimited and -2 is an error. 
In either case we fallback to the standard system call to get available memory. I would have used warning but since these messages were occurring during a test run causing test failures. > > --- > > src/hotspot/os/linux/osContainer_linux.cpp > > Dead code: > > 376 #if 0 > 377 os::Linux::print_container_info(tty); > ... > 390 #endif I left it in for standalone testing. Should I use some other #if? Bob. > > Thanks, > David > >> Bob. From mbrandy at linux.vnet.ibm.com Thu Oct 12 16:16:16 2017 From: mbrandy at linux.vnet.ibm.com (Matthew Brandyberry) Date: Thu, 12 Oct 2017 11:16:16 -0500 Subject: RFR(M) 8188165: PPC64: Optimize Unsafe.copyMemory and arraycopy In-Reply-To: References: Message-ID: [Ping] On 9/29/17 4:00 PM, Matthew Brandyberry wrote: > This is specific to PPC64LE only. > > The emphasis in the proposed code is on minimizing branches. Thus, > this code makes no attempt to avoid misaligned accesses and each block > is designed to copy as many elements as possible. > > As one data point, this yields as much as a 13x improvement in > jbyte_disjoint_arraycopy for certain misaligned scenarios. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8188165 > Webrev: http://cr.openjdk.java.net/~mbrandy/8188165/jdk10/v1/ > > Thanks, > -Matt > From mbrandy at linux.vnet.ibm.com Thu Oct 12 16:17:12 2017 From: mbrandy at linux.vnet.ibm.com (Matthew Brandyberry) Date: Thu, 12 Oct 2017 11:17:12 -0500 Subject: [8u] RFR (M) 8181809 PPC64: Leverage mtfprd/mffprd on POWER8 In-Reply-To: References: Message-ID: <1726202e-b051-6c46-73f8-2f2f5f01e418@linux.vnet.ibm.com> [Ping] On 9/28/17 12:53 PM, Matthew Brandyberry wrote: > Hi, > > Please review this backport of 8181809 for jdk8u. > > It applies cleanly to jdk8u except for the lack of C1 support on PPC > in 8u -- thus those changes are omitted here. > > This is a PPC-specific hotspot optimization that leverages the > mtfprd/mffprd instructions for for movement between general purpose > and floating point registers (rather than through memory). It yields a > ~35% improvement measured via a microbenchmark. > > webrev?????? :http://cr.openjdk.java.net/~mbrandy/8181809/jdk8u/v1 > > bug????????? :https://bugs.openjdk.java.net/browse/JDK-8181809 > > review > thread:http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-June/027226.html > > > Thank you. > -Matt > From coleen.phillimore at oracle.com Thu Oct 12 17:23:33 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 12 Oct 2017 13:23:33 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <1a37a25f-8a72-3990-4849-24dbfbc21b0a@oracle.com> References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> <4fb119b8-cb0b-474c-ebbc-60841ef4aa46@oracle.com> <5986a9d6-a27f-8462-d13e-5e11de8e358c@oracle.com> <354409e5-8985-6710-1a1a-848a6b366d12@oracle.com> <1a37a25f-8a72-3990-4849-24dbfbc21b0a@oracle.com> Message-ID: Here's the qseries in webrevs. 
open webrev at http://cr.openjdk.java.net/~coleenp/8188220.add_ptr/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8188220.cmpxchg_ptr/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8188220.cmpxchg_if_null/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8188220.xchg_ptr/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8188220.store_ptr/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8188220.load_ptr_acquire/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8188220.assembler_cmpxchg/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8188220.casptr/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev assembler_cmpxchg should be release_store_ptr which got qrefreshed with trying to get the cmpxchg function pointer to compile. Thanks, Coleen On 10/12/17 8:55 AM, coleen.phillimore at oracle.com wrote: > > > On 10/12/17 8:21 AM, David Holmes wrote: >> On 12/10/2017 9:52 PM, coleen.phillimore at oracle.com wrote: >>> On 10/12/17 3:23 AM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> Thanks for doing this tedious cleanup! >>>> >>>> It was good to see so many casts disappear; and sad to see so many >>>> have to now appear in the sync code. :( >>> >>> The sync code has _owner field as void* because it can be several >>> things.? I didn't try to >> >> Yeah I understood why this had to happen. >> >>>> >>>> There were a few things that struck me ... >>>> >>>> Atomic::xchg_ptr turned into Atomic::xchg; yet for the stub >>>> generator routines atomic_xchg_ptr became atomic_xchg_long - but I >>>> can't see where that stub will now come into play? >>> >>> http://cr.openjdk.java.net/~coleenp/8188220.02/webrev/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp.udiff.html >>> >>> >>> I tried to remove it but windows x64 uses a stub for xchg (and others). >> >> Ah so I think this is where it is used: >> >> ./os_cpu/windows_x86/atomic_windows_x86.hpp:DEFINE_STUB_XCHG(8, >> jlong, os::atomic_xchg_ptr_func) >> >> ie atomic_xchg_ptr is the stub for Atomic::xchg<8> >> >>> There was a preexisting stub for cmpxchg_long which I followed >>> naming convention. >>> >>> ?? static address _atomic_cmpxchg_entry; >>> ?? static address _atomic_cmpxchg_byte_entry; >>> ?? static address _atomic_cmpxchg_long_entry; >>> >>> Technically I think it should be long_long, as well as the >>> cmpxchg_long_entry as well. >> >> Or int64_t >> >>> I also missed renaming store_ptr_entry and add_ptr_entry.? What do >>> you suggest? >> >> store_ptr_entry actually seems unused. >> >> add_ptr_entry looks like it needs to be the 64-bit Atomic::add<8> >> implementation - so probably add_int64_t_entry. > > https://bugs.openjdk.java.net/browse/JDK-8186903 > > I'm renaming to ptr => long for now to follow other code and fixing > the name with this RFE to what it really is, and what we decide. > > It was pretty ugly as: > > ? static jint????? (*atomic_add_func)?????????? (jint, volatile jint*); > ? static intptr_t? (*atomic_add_ptr_func)?????? (intptr_t, volatile > intptr_t*); > > When the other uses jint as an argument.?? Actually, I think add_ptr > makes more sense in this context than long.? I think I should leave > this name and not make it long. >> >>>> >>>> --- >>>> >>>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>>> >>>> +? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>>> +????????????????????????????????? (volatile intptr_t *)&_data, >>>> +????????????????????????????????? 
(intptr_t)old_age._data); >>>> >>>> The actual types here should be size_t, can we now change it to use >>>> the real type? >>> >>> Yes, fixed.? Missed that one. >>>> >>>> --- >>>> >>>> src/hotspot/share/oops/cpCache.cpp >>>> >>>> ?114 bool ConstantPoolCacheEntry::init_flags_atomic(intptr_t flags) { >>>> ?115?? intptr_t result = Atomic::cmpxchg(flags, &_flags, (intptr_t)0); >>>> ?116?? return (result == 0); >>>> ?117 } >>>> >>>> _flags is actually intx, yet above we treat it as intptr_t. But >>>> then later: >>>> >>>> ?156?? if (_flags == 0) { >>>> ?157???? intx newflags = (value & parameter_size_mask); >>>> ?158???? Atomic::cmpxchg(newflags, &_flags, (intx)0); >>>> ?159?? } >>>> >>>> its intx again. This looks really odd to me. >>> >>> It's better as an intx, because that's what it's declared as. I'll >>> patch up some other uses but don't promise total consistency because >>> I don't want to pull on this particular sweater thread too much. >>> intx and intptr_t I believe are typedefed to each other. >>> >>> typedef intptr_t? intx; >>> >>> Should we not have intx and uintx and change all their uses? I've >>> sworn off large changes after this though. >> >> I don't know why we have intx/uintx other than someone not liking >> having to type intptr_t all the time. >> >>> ConstantPoolCacheEntry::make_flags returns an int.?? I fixed >>> init_flags_atomic() because it's declared with an intx and defined >>> with intptr_t. >> >> Ok. >> >>>> >>>> --- >>>> >>>> src/hotspot/share/runtime/objectMonitor.inline.hpp >>>> >>>> The addition of header_addr() made me a little nervous :) Can we >>>> add a sanity assert either inside it (or in synchronizer.cpp), to >>>> verify that this == &_header? (or monitor == monitor->header_addr()) >>> >>> Where I introduced it, looked like undefined behavior because it >>> assumed that the header was the first field. >> >> Assumes and expects, I think. Not sure if it is undefined behaviour >> or not. > > Assumes without giving the static compiler a chance to check that what > you've done is correct or not.? Maybe that's not undefined behavior. >> >>> So I should sanity check that other places with undefined behavior >>> won't break?? Sure I'll do that. >> >> No only sanity check that your change actually didn't change >> anything. :) > > As well. >> >>>> >>>> --- >>>> >>>> src/hotspot/share/runtime/synchronizer.cpp >>>> >>>> ?// global list of blocks of monitors >>>> -// gBlockList is really PaddedEnd *, but we don't >>>> -// want to expose the PaddedEnd template more than necessary. >>>> -ObjectMonitor * volatile ObjectSynchronizer::gBlockList = NULL; >>>> +PaddedEnd * volatile ObjectSynchronizer::gBlockList >>>> = NULL; >>>> >>>> Did this have to change? I'm not sure why we didn't want to expose >>>> PaddedEnd, but it is now being exposed. >>> >>> I didn't see why not and it avoided a bunch of ugly casts.?? I >>> tested that the SA was fine with it because the SA manually did the >>> address adjustment.? The SA could be fixed to know about PaddedEnd >>> if it's somehting they want to do. >> >> Glad you mentioned SA as I forgot to mention that with the vmStructs >> changes. :) >> >>> Thanks for going through and reviewing all of this.?? Please answer >>> question about the stub function name and I'll include the change >>> with this patch. >> >> Would like to see an incremental webrev please. (Should be easy if >> you're using mq :) ) > > Will do. 
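Picking up the taskqueue.inline.hpp point above, a toy sketch of the cleanup (the struct and the __sync builtin are only stand-ins for the real Age type and Atomic::cmpxchg): once all three operands are deduced from the size_t field, the intptr_t casts disappear.

#include <cstddef>

struct AgeSketch { size_t _data; };     // stand-in for the queue's Age type

struct TaskQueueSketch {
  volatile size_t _data;

  size_t cmpxchg_age(AgeSketch new_age, AgeSketch old_age) {
    // Old form, with the casts the templated version makes unnecessary:
    //   return (size_t) Atomic::cmpxchg((intptr_t)new_age._data,
    //                                   (volatile intptr_t *)&_data,
    //                                   (intptr_t)old_age._data);
    // New form, every operand already a size_t:
    //   return Atomic::cmpxchg(new_age._data, &_data, old_age._data);
    return __sync_val_compare_and_swap(&_data, old_age._data, new_age._data);
  }
};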
> > Thanks, > Coleen >> >> Thanks, >> David >> >>> Coleen >>> >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>> >>>> On 11/10/2017 11:50 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Please review version .02 which removes use of replace_if_null, >>>>> but not the function.? A separate RFE can be filed to discuss that. >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.02/webrev >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>> On 10/11/17 7:07 AM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> >>>>>> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>>>>>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>>>>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>>>>>> >>>>>>>> Removing the operation is a different argument to renaming it. >>>>>>>> Most of the above argues for removing it. :) >>>>>>> >>>>>>> +1 on removing >>>>>> >>>>>> Thank you for all your feedback.? Erik best described what I was >>>>>> thinking.? I will remove it then.? There were not that many >>>>>> instances and one instance that people thought would be useful, >>>>>> needed the old return value. >>>>>> >>>>>> Coleen >>>>>>> >>>>>>> Thanks, Robbin >>>>>>> >>>>>>>> >>>>>>>> Cheers, >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>>> I have not reviewed this completely yet - thought I'd wait >>>>>>>>> with that until we agree about replace_if_null, if that is okay. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> /Erik >>>>>>>>> >>>>>>>>> On 2017-10-11 05:55, David Holmes wrote: >>>>>>>>>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>>>>>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Summary: With the new template functions these are >>>>>>>>>>>> unnecessary. >>>>>>>>>>>> >>>>>>>>>>>> 2. renamed Atomic::replace_if_null to >>>>>>>>>>>> Atomic::cmpxchg_if_null. I disliked the first name because >>>>>>>>>>>> it's not explicit from the callers that there's an >>>>>>>>>>>> underlying cas. If people want to fight, I'll remove the >>>>>>>>>>>> function and use cmpxchg because there are only a couple >>>>>>>>>>>> places where this is a little nicer. >>>>>>>>>>> >>>>>>>>>>> I'm still looking at other parts, but I want to respond to >>>>>>>>>>> this now. >>>>>>>>>>> >>>>>>>>>>> I object to this change.? I think the proposed new name is >>>>>>>>>>> confusing, >>>>>>>>>>> suggesting there are two different comparisons involved. >>>>>>>>>>> >>>>>>>>>>> I originally called it something else that I wasn't entirely >>>>>>>>>>> happy >>>>>>>>>>> with.? When David suggested replace_if_null I quickly >>>>>>>>>>> adopted that as >>>>>>>>>>> I think that name exactly describes what it does. In >>>>>>>>>>> particular, I >>>>>>>>>>> think "atomic replace if" pretty clearly suggests a >>>>>>>>>>> test-and-set / >>>>>>>>>>> compare-and-swap type of operation. >>>>>>>>>> >>>>>>>>>> I totally agree. It's an Atomic operation, the implementation >>>>>>>>>> will involve something atomic, it doesn't matter if it is >>>>>>>>>> cmpxchg or something else. The name replace_if_null describes >>>>>>>>>> exactly what the function does - it doesn't have to describe >>>>>>>>>> how it does it. >>>>>>>>>> >>>>>>>>>> David >>>>>>>>>> ----- >>>>>>>>>> >>>>>>>>>>> Further, I think any name involving "cmpxchg" is problematic >>>>>>>>>>> because >>>>>>>>>>> the result of this operation is intentionally different from >>>>>>>>>>> cmpxchg, >>>>>>>>>>> in order to better support the primary use-case, which is lazy >>>>>>>>>>> initialization. 
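A minimal sketch of the lazy-initialization idiom being described, assuming the (new_value, destination) argument order used elsewhere in this thread and a hypothetical Foo payload; the boolean "did this thread install it" answer is exactly what the initializer needs:

#include "runtime/atomic.hpp"   // Atomic::replace_if_null, as discussed here

struct Foo { };                 // hypothetical lazily-created payload

static Foo* volatile _cached_foo = NULL;

Foo* get_foo() {
  if (_cached_foo == NULL) {
    Foo* f = new Foo();
    // A CAS against NULL under the covers; true means our value was installed.
    if (!Atomic::replace_if_null(f, &_cached_foo)) {
      delete f;                 // lost the race; keep the winner's object
    }
  }
  return _cached_foo;
}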
>>>>>>>>>>> >>>>>>>>>>> I also object to your alternative suggestion of removing the >>>>>>>>>>> operation >>>>>>>>>>> entirely and just using cmpxchg directly instead.? I don't >>>>>>>>>>> recall how >>>>>>>>>>> many occurrences there presently are, but I suspect more >>>>>>>>>>> could easily >>>>>>>>>>> be added; it's part of a lazy initialization pattern similar >>>>>>>>>>> to DCLP >>>>>>>>>>> but without the locks. >>>>>>>>>>> >>>>>>>>> >>>>>> >>>>> >>> > From david.holmes at oracle.com Thu Oct 12 21:56:24 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 13 Oct 2017 07:56:24 +1000 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: <7e4fbba3-4462-a729-b663-99fd6919360f@oracle.com> <59DDCC37.8050306@oracle.com> <3089b845-0532-d6a9-b68f-91b3b21c6ef3@oracle.com> <591c33b3-f9b1-55f3-2c4b-ddfad4ed9a39@oracle.com> <4fb119b8-cb0b-474c-ebbc-60841ef4aa46@oracle.com> <5986a9d6-a27f-8462-d13e-5e11de8e358c@oracle.com> <354409e5-8985-6710-1a1a-848a6b366d12@oracle.com> <1a37a25f-8a72-3990-4849-24dbfbc21b0a@oracle.com> Message-ID: On 13/10/2017 3:23 AM, coleen.phillimore at oracle.com wrote: > > Here's the qseries in webrevs. Are these the latest or do they match the big webrev you previously put out? > open webrev at http://cr.openjdk.java.net/~coleenp/8188220.add_ptr/webrev There are still two add(-n) instead of sub(n) cases. Also here: --- old/src/hotspot/share/services/mallocTracker.hpp 2017-10-12 12:15:32.951573341 -0400 +++ new/src/hotspot/share/services/mallocTracker.hpp 2017-10-12 12:15:32.386616320 -0400 @@ -68,7 +68,7 @@ if (sz > 0) { // unary minus operator applied to unsigned type, result still unsigned #pragma warning(suppress: 4146) - Atomic::add(-sz, &_size); + Atomic::sub(sz, &_size); You should be able to remove the comment and pragma now as no unary minus is being applied (at this level). Thanks, David > open webrev at > http://cr.openjdk.java.net/~coleenp/8188220.cmpxchg_ptr/webrev > open webrev at > http://cr.openjdk.java.net/~coleenp/8188220.cmpxchg_if_null/webrev > open webrev at http://cr.openjdk.java.net/~coleenp/8188220.xchg_ptr/webrev > open webrev at http://cr.openjdk.java.net/~coleenp/8188220.store_ptr/webrev > open webrev at > http://cr.openjdk.java.net/~coleenp/8188220.load_ptr_acquire/webrev > open webrev at > http://cr.openjdk.java.net/~coleenp/8188220.assembler_cmpxchg/webrev > open webrev at http://cr.openjdk.java.net/~coleenp/8188220.casptr/webrev > open webrev at > http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev > > assembler_cmpxchg should be release_store_ptr which got qrefreshed with > trying to get the cmpxchg function pointer to compile. > > Thanks, > Coleen > > On 10/12/17 8:55 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 10/12/17 8:21 AM, David Holmes wrote: >>> On 12/10/2017 9:52 PM, coleen.phillimore at oracle.com wrote: >>>> On 10/12/17 3:23 AM, David Holmes wrote: >>>>> Hi Coleen, >>>>> >>>>> Thanks for doing this tedious cleanup! >>>>> >>>>> It was good to see so many casts disappear; and sad to see so many >>>>> have to now appear in the sync code. :( >>>> >>>> The sync code has _owner field as void* because it can be several >>>> things.? I didn't try to >>> >>> Yeah I understood why this had to happen. >>> >>>>> >>>>> There were a few things that struck me ... 
>>>>> >>>>> Atomic::xchg_ptr turned into Atomic::xchg; yet for the stub >>>>> generator routines atomic_xchg_ptr became atomic_xchg_long - but I >>>>> can't see where that stub will now come into play? >>>> >>>> http://cr.openjdk.java.net/~coleenp/8188220.02/webrev/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp.udiff.html >>>> >>>> >>>> I tried to remove it but windows x64 uses a stub for xchg (and others). >>> >>> Ah so I think this is where it is used: >>> >>> ./os_cpu/windows_x86/atomic_windows_x86.hpp:DEFINE_STUB_XCHG(8, >>> jlong, os::atomic_xchg_ptr_func) >>> >>> ie atomic_xchg_ptr is the stub for Atomic::xchg<8> >>> >>>> There was a preexisting stub for cmpxchg_long which I followed >>>> naming convention. >>>> >>>> ?? static address _atomic_cmpxchg_entry; >>>> ?? static address _atomic_cmpxchg_byte_entry; >>>> ?? static address _atomic_cmpxchg_long_entry; >>>> >>>> Technically I think it should be long_long, as well as the >>>> cmpxchg_long_entry as well. >>> >>> Or int64_t >>> >>>> I also missed renaming store_ptr_entry and add_ptr_entry.? What do >>>> you suggest? >>> >>> store_ptr_entry actually seems unused. >>> >>> add_ptr_entry looks like it needs to be the 64-bit Atomic::add<8> >>> implementation - so probably add_int64_t_entry. >> >> https://bugs.openjdk.java.net/browse/JDK-8186903 >> >> I'm renaming to ptr => long for now to follow other code and fixing >> the name with this RFE to what it really is, and what we decide. >> >> It was pretty ugly as: >> >> ? static jint????? (*atomic_add_func)?????????? (jint, volatile jint*); >> ? static intptr_t? (*atomic_add_ptr_func)?????? (intptr_t, volatile >> intptr_t*); >> >> When the other uses jint as an argument.?? Actually, I think add_ptr >> makes more sense in this context than long.? I think I should leave >> this name and not make it long. >>> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>>>> >>>>> +? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>>>> +????????????????????????????????? (volatile intptr_t *)&_data, >>>>> +????????????????????????????????? (intptr_t)old_age._data); >>>>> >>>>> The actual types here should be size_t, can we now change it to use >>>>> the real type? >>>> >>>> Yes, fixed.? Missed that one. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/oops/cpCache.cpp >>>>> >>>>> ?114 bool ConstantPoolCacheEntry::init_flags_atomic(intptr_t flags) { >>>>> ?115?? intptr_t result = Atomic::cmpxchg(flags, &_flags, (intptr_t)0); >>>>> ?116?? return (result == 0); >>>>> ?117 } >>>>> >>>>> _flags is actually intx, yet above we treat it as intptr_t. But >>>>> then later: >>>>> >>>>> ?156?? if (_flags == 0) { >>>>> ?157???? intx newflags = (value & parameter_size_mask); >>>>> ?158???? Atomic::cmpxchg(newflags, &_flags, (intx)0); >>>>> ?159?? } >>>>> >>>>> its intx again. This looks really odd to me. >>>> >>>> It's better as an intx, because that's what it's declared as. I'll >>>> patch up some other uses but don't promise total consistency because >>>> I don't want to pull on this particular sweater thread too much. >>>> intx and intptr_t I believe are typedefed to each other. >>>> >>>> typedef intptr_t? intx; >>>> >>>> Should we not have intx and uintx and change all their uses? I've >>>> sworn off large changes after this though. >>> >>> I don't know why we have intx/uintx other than someone not liking >>> having to type intptr_t all the time. >>> >>>> ConstantPoolCacheEntry::make_flags returns an int.?? 
I fixed >>>> init_flags_atomic() because it's declared with an intx and defined >>>> with intptr_t. >>> >>> Ok. >>> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/runtime/objectMonitor.inline.hpp >>>>> >>>>> The addition of header_addr() made me a little nervous :) Can we >>>>> add a sanity assert either inside it (or in synchronizer.cpp), to >>>>> verify that this == &_header? (or monitor == monitor->header_addr()) >>>> >>>> Where I introduced it, looked like undefined behavior because it >>>> assumed that the header was the first field. >>> >>> Assumes and expects, I think. Not sure if it is undefined behaviour >>> or not. >> >> Assumes without giving the static compiler a chance to check that what >> you've done is correct or not.? Maybe that's not undefined behavior. >>> >>>> So I should sanity check that other places with undefined behavior >>>> won't break?? Sure I'll do that. >>> >>> No only sanity check that your change actually didn't change >>> anything. :) >> >> As well. >>> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/runtime/synchronizer.cpp >>>>> >>>>> ?// global list of blocks of monitors >>>>> -// gBlockList is really PaddedEnd *, but we don't >>>>> -// want to expose the PaddedEnd template more than necessary. >>>>> -ObjectMonitor * volatile ObjectSynchronizer::gBlockList = NULL; >>>>> +PaddedEnd * volatile ObjectSynchronizer::gBlockList >>>>> = NULL; >>>>> >>>>> Did this have to change? I'm not sure why we didn't want to expose >>>>> PaddedEnd, but it is now being exposed. >>>> >>>> I didn't see why not and it avoided a bunch of ugly casts.?? I >>>> tested that the SA was fine with it because the SA manually did the >>>> address adjustment.? The SA could be fixed to know about PaddedEnd >>>> if it's somehting they want to do. >>> >>> Glad you mentioned SA as I forgot to mention that with the vmStructs >>> changes. :) >>> >>>> Thanks for going through and reviewing all of this.?? Please answer >>>> question about the stub function name and I'll include the change >>>> with this patch. >>> >>> Would like to see an incremental webrev please. (Should be easy if >>> you're using mq :) ) >> >> Will do. >> >> Thanks, >> Coleen >>> >>> Thanks, >>> David >>> >>>> Coleen >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>> >>>>> On 11/10/2017 11:50 PM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> Please review version .02 which removes use of replace_if_null, >>>>>> but not the function.? A separate RFE can be filed to discuss that. >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.02/webrev >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >>>>>> On 10/11/17 7:07 AM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> >>>>>>> On 10/11/17 4:12 AM, Robbin Ehn wrote: >>>>>>>> On 10/11/2017 10:09 AM, David Holmes wrote: >>>>>>>>> On 11/10/2017 5:45 PM, Erik ?sterlund wrote: >>>>>>>>> >>>>>>>>> Removing the operation is a different argument to renaming it. >>>>>>>>> Most of the above argues for removing it. :) >>>>>>>> >>>>>>>> +1 on removing >>>>>>> >>>>>>> Thank you for all your feedback.? Erik best described what I was >>>>>>> thinking.? I will remove it then.? There were not that many >>>>>>> instances and one instance that people thought would be useful, >>>>>>> needed the old return value. 
>>>>>>> >>>>>>> Coleen >>>>>>>> >>>>>>>> Thanks, Robbin >>>>>>>> >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> David >>>>>>>>> ----- >>>>>>>>> >>>>>>>>>> I have not reviewed this completely yet - thought I'd wait >>>>>>>>>> with that until we agree about replace_if_null, if that is okay. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> /Erik >>>>>>>>>> >>>>>>>>>> On 2017-10-11 05:55, David Holmes wrote: >>>>>>>>>>> On 11/10/2017 1:43 PM, Kim Barrett wrote: >>>>>>>>>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Summary: With the new template functions these are >>>>>>>>>>>>> unnecessary. >>>>>>>>>>>>> >>>>>>>>>>>>> 2. renamed Atomic::replace_if_null to >>>>>>>>>>>>> Atomic::cmpxchg_if_null. I disliked the first name because >>>>>>>>>>>>> it's not explicit from the callers that there's an >>>>>>>>>>>>> underlying cas. If people want to fight, I'll remove the >>>>>>>>>>>>> function and use cmpxchg because there are only a couple >>>>>>>>>>>>> places where this is a little nicer. >>>>>>>>>>>> >>>>>>>>>>>> I'm still looking at other parts, but I want to respond to >>>>>>>>>>>> this now. >>>>>>>>>>>> >>>>>>>>>>>> I object to this change.? I think the proposed new name is >>>>>>>>>>>> confusing, >>>>>>>>>>>> suggesting there are two different comparisons involved. >>>>>>>>>>>> >>>>>>>>>>>> I originally called it something else that I wasn't entirely >>>>>>>>>>>> happy >>>>>>>>>>>> with.? When David suggested replace_if_null I quickly >>>>>>>>>>>> adopted that as >>>>>>>>>>>> I think that name exactly describes what it does. In >>>>>>>>>>>> particular, I >>>>>>>>>>>> think "atomic replace if" pretty clearly suggests a >>>>>>>>>>>> test-and-set / >>>>>>>>>>>> compare-and-swap type of operation. >>>>>>>>>>> >>>>>>>>>>> I totally agree. It's an Atomic operation, the implementation >>>>>>>>>>> will involve something atomic, it doesn't matter if it is >>>>>>>>>>> cmpxchg or something else. The name replace_if_null describes >>>>>>>>>>> exactly what the function does - it doesn't have to describe >>>>>>>>>>> how it does it. >>>>>>>>>>> >>>>>>>>>>> David >>>>>>>>>>> ----- >>>>>>>>>>> >>>>>>>>>>>> Further, I think any name involving "cmpxchg" is problematic >>>>>>>>>>>> because >>>>>>>>>>>> the result of this operation is intentionally different from >>>>>>>>>>>> cmpxchg, >>>>>>>>>>>> in order to better support the primary use-case, which is lazy >>>>>>>>>>>> initialization. >>>>>>>>>>>> >>>>>>>>>>>> I also object to your alternative suggestion of removing the >>>>>>>>>>>> operation >>>>>>>>>>>> entirely and just using cmpxchg directly instead.? I don't >>>>>>>>>>>> recall how >>>>>>>>>>>> many occurrences there presently are, but I suspect more >>>>>>>>>>>> could easily >>>>>>>>>>>> be added; it's part of a lazy initialization pattern similar >>>>>>>>>>>> to DCLP >>>>>>>>>>>> but without the locks. >>>>>>>>>>>> >>>>>>>>>> >>>>>>> >>>>>> >>>> >> > From kim.barrett at oracle.com Thu Oct 12 23:17:38 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 12 Oct 2017 19:17:38 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: Message-ID: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> > On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: > > Summary: With the new template functions these are unnecessary. > > The changes are mostly s/_ptr// and removing the cast to return type. 
There weren't many types that needed to be improved to match the template version of the function. Some notes: > 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, rearranging arguments. > 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. I disliked the first name because it's not explicit from the callers that there's an underlying cas. If people want to fight, I'll remove the function and use cmpxchg because there are only a couple places where this is a little nicer. > 3. Added Atomic::sub() > > Tested with JPRT, mach5 tier1-5 on linux,windows and solaris. > > open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8188220 > > Thanks, > Coleen I looked harder at the potential ABA problems, and believe they are okay. There can be multiple threads doing pushes, and there can be multiple threads doing pops, but not both at the same time. ------------------------------------------------------------------------------ src/hotspot/cpu/zero/cppInterpreter_zero.cpp 279 if (Atomic::cmpxchg(monitor, lockee->mark_addr(), disp) != disp) { How does this work? monitor and disp seem like they have unrelated types? Given that this is zero-specific code, maybe this hasn't been tested? Similarly here: 423 if (Atomic::cmpxchg(header, rcvr->mark_addr(), lock) != lock) { ------------------------------------------------------------------------------ src/hotspot/share/asm/assembler.cpp 239 dcon->value_fn = cfn; Is it actually safe to remove the atomic update? If multiple threads performing the assignment *are* possible (and I don't understand the context yet, so don't know the answer to that), then a bare non-atomic assignment is a race, e.g. undefined behavior. Regardless of that, I think the CAST_FROM_FN_PTR should be retained. ------------------------------------------------------------------------------ src/hotspot/share/classfile/classLoaderData.cpp 167 Chunk* head = (Chunk*) OrderAccess::load_acquire(&_head); I think the cast to Chunk* is no longer needed. ------------------------------------------------------------------------------ src/hotspot/share/classfile/classLoaderData.cpp 946 ClassLoaderData* old = Atomic::cmpxchg(cld, cld_addr, (ClassLoaderData*)NULL); 947 if (old != NULL) { 948 delete cld; 949 // Returns the data. 950 return old; 951 } That could instead be if (!Atomic::replace_if_null(cld, cld_addr)) { delete cld; // Lost the race. return *cld_addr; // Use the winner's value. } And apparently the caller of CLDG::add doesn't care whether the returned CLD has actually been added to the graph yet. If that's not true, then there's a bug here, since a race loser might return a winner's value before the winner has actually done the insertion. ------------------------------------------------------------------------------ src/hotspot/share/classfile/verifier.cpp 71 static void* verify_byte_codes_fn() { 72 if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == NULL) { 73 void *lib_handle = os::native_java_library(); 74 void *func = os::dll_lookup(lib_handle, "VerifyClassCodesForMajorVersion"); 75 OrderAccess::release_store(&_verify_byte_codes_fn, func); 76 if (func == NULL) { 77 _is_new_verify_byte_codes_fn = false; 78 func = os::dll_lookup(lib_handle, "VerifyClassCodes"); 79 OrderAccess::release_store(&_verify_byte_codes_fn, func); 80 } 81 } 82 return (void*)_verify_byte_codes_fn; 83 } [pre-existing] I think this code has race problems; a caller could unexpectedly and inappropriately return NULL. 
Consider the case where there is no VerifyClassCodesForMajorVersion, but there is VerifyClassCodes. The variable is initially NULL. Both Thread1 and Thread2 reach line 73, having both seen a NULL value for the variable. Thread1 reaches line 80, setting the variable to VerifyClassCodes. Thread2 reaches line 76, resetting the variable to NULL. Thread1 reads the now (momentarily) NULL value and returns it. I think the first release_store should be conditional on func != NULL. Also, the usage of _is_new_verify_byte_codes_fn seems suspect. And a minor additional nit: the cast in the return is unnecessary. ------------------------------------------------------------------------------ src/hotspot/share/code/nmethod.cpp 1664 nmethod* observed_mark_link = _oops_do_mark_link; 1665 if (observed_mark_link == NULL) { 1666 // Claim this nmethod for this thread to mark. 1667 if (Atomic::cmpxchg_if_null(NMETHOD_SENTINEL, &_oops_do_mark_link)) { With these changes, the only use of observed_mark_link is in the if. I'm not sure that variable is really useful anymore, e.g. just use if (_oops_do_mark_link == NULL) { ------------------------------------------------------------------------------ src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp In CMSCollector::par_take_from_overflow_list, if BUSY and prefix were of type oopDesc*, I think there would be a whole lot fewer casts and cast_to_oop's. Later on, I think suffix_head, observed_overflow_list, and curr_overflow_list could also be oopDesc* instead of oop to eliminate more casts. And some similar changes in CMSCollector::par_push_on_overflow_list. And similarly in parNewGeneration.cpp, in push_on_overflow_list and take_from_overflow_list_work. As noted in the comments for JDK-8165857, the lists and "objects" involved here aren't really oops, but rather the shattered remains of oops. The suggestion there was to use HeapWord* and carry through the fanout; what was actually done was to change _overflow_list to oopDesc* to minimize fanout, even though that's kind of lying to the type system. Now, with the cleanup of cmpxchg_ptr and such, we're paying the price of doing the minimal thing back then. ------------------------------------------------------------------------------ src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp 7960 Atomic::add(-n, &_num_par_pushes); Atomic::sub ------------------------------------------------------------------------------ src/hotspot/share/gc/cms/parNewGeneration.cpp 1455 Atomic::add(-n, &_num_par_pushes); Atomic::sub ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/dirtyCardQueue.cpp 283 void* actual = Atomic::cmpxchg(next, &_cur_par_buffer_node, nd); ... 289 nd = static_cast(actual); Change actual's type to BufferNode* and remove the cast on line 289. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1CollectedHeap.cpp [pre-existing] 3499 old = (CompiledMethod*)_postponed_list; I think that cast is only needed because G1CodeCacheUnloadingTask::_postponed_list is incorrectly typed as "volatile CompiledMethod*", when I think it ought to be "CompiledMethod* volatile". 
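Since the two spellings are easy to mix up, a small illustration of the difference (the class name is only a stand-in):

struct CompiledMethodSketch { int dummy; };

// Pointer to a volatile object: accesses *through* the pointer are volatile.
volatile CompiledMethodSketch* pointee_is_volatile;

// Volatile pointer to an ordinary object: the pointer variable itself is the
// shared, concurrently-updated thing, which is what a claimed-list cursor
// wants and what lets "old = _postponed_list;" compile without a cast.
CompiledMethodSketch* volatile pointer_is_volatile;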
I think G1CodeCacheUnloading::_claimed_nmethod is similarly mis-typed, with a similar should not be needed cast: 3530 first = (CompiledMethod*)_claimed_nmethod; and another for _postponed_list here: 3552 claim = (CompiledMethod*)_postponed_list; ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1HotCardCache.cpp 77 jbyte* previous_ptr = (jbyte*)Atomic::cmpxchg(card_ptr, I think the cast of the cmpxchg result is no longer needed. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp 254 char* touch_addr = (char*)Atomic::add(actual_chunk_size, &_cur_addr) - actual_chunk_size; I think the cast of the add result is no longer needed. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1StringDedup.cpp 213 return (size_t)Atomic::add(partition_size, &_next_bucket) - partition_size; I think the cast of the add result is no longer needed. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/heapRegionRemSet.cpp 200 PerRegionTable* res = 201 Atomic::cmpxchg(nxt, &_free_list, fl); Please remove the line break, now that the code has been simplified. But wait, doesn't this alloc exhibit classic ABA problems? I *think* this works because alloc and bulk_free are called in different phases, never overlapping. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/sparsePRT.cpp 295 SparsePRT* res = 296 Atomic::cmpxchg(sprt, &_head_expanded_list, hd); and 307 SparsePRT* res = 308 Atomic::cmpxchg(next, &_head_expanded_list, hd); I'd rather not have the line breaks in these either. And get_from_expanded_list also appears to have classic ABA problems. I *think* this works because add_to_expanded_list and get_from_expanded_list are called in different phases, never overlapping. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/taskqueue.inline.hpp 262 return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, 263 (volatile intptr_t *)&_data, 264 (intptr_t)old_age._data); This should be return Atomic::cmpxchg(new_age._data, &_data, old_age._data); ------------------------------------------------------------------------------ src/hotspot/share/interpreter/bytecodeInterpreter.cpp This doesn't have any casts, which I think is correct. 708 if (Atomic::cmpxchg(header, rcvr->mark_addr(), mark) == mark) { but these do. 718 if (Atomic::cmpxchg((void*)new_header, rcvr->mark_addr(), mark) == mark) { 737 if (Atomic::cmpxchg((void*)new_header, rcvr->mark_addr(), header) == header) { I'm not sure how the ones with casts even compile? mark_addr() seems to be a markOop*, which is a markOopDesc**, where markOopDesc is a class. void* is not implicitly convertible to markOopDesc*. Hm, this entire file is #ifdef CC_INTERP. Is this zero-only code? Or something like that? 
Similarly here: 906 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) { and 917 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) { 935 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) { and here: 1847 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) { 1858 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) { 1878 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) { and here: 1847 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) { 1858 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) { 1878 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) { ------------------------------------------------------------------------------ src/hotspot/share/memory/metaspace.cpp 1502 size_t value = OrderAccess::load_acquire(&_capacity_until_GC); ... 1537 return (size_t)Atomic::sub((intptr_t)v, &_capacity_until_GC); These and other uses of _capacity_until_GC suggest that variable's type should be size_t rather than intptr_t. Note that I haven't done a careful check of uses to see if there are any places where such a change would cause problems. ------------------------------------------------------------------------------ src/hotspot/share/oops/constantPool.cpp 229 OrderAccess::release_store((Klass* volatile *)adr, k); 246 OrderAccess::release_store((Klass* volatile *)adr, k); 514 OrderAccess::release_store((Klass* volatile *)adr, k); Casts are not needed. ------------------------------------------------------------------------------ src/hotspot/share/oops/constantPool.hpp 148 volatile intptr_t adr = OrderAccess::load_acquire(obj_at_addr_raw(which)); [pre-existing] Why is adr declared volatile? ------------------------------------------------------------------------------ src/hotspot/share/oops/cpCache.cpp 157 intx newflags = (value & parameter_size_mask); 158 Atomic::cmpxchg(newflags, &_flags, (intx)0); This is a nice demonstration of why I wanted to include some value preserving integral conversions in cmpxchg, rather than requiring exact type matching in the integral case. There have been some others that I haven't commented on. Apparently we (I) got away with including such conversions in Atomic::add, which I'd forgotten about. And see comment regarding Atomic::sub below. ------------------------------------------------------------------------------ src/hotspot/share/oops/cpCache.hpp 139 volatile Metadata* _f1; // entry specific metadata field [pre-existing] I suspect the type should be Metadata* volatile. And that would eliminate the need for the cast here: 339 Metadata* f1_ord() const { return (Metadata *)OrderAccess::load_acquire(&_f1); } I don't know if there are any other changes needed or desirable around _f1 usage. ------------------------------------------------------------------------------ src/hotspot/share/oops/method.hpp 139 volatile address from_compiled_entry() const { return OrderAccess::load_acquire(&_from_compiled_entry); } 140 volatile address from_compiled_entry_no_trampoline() const; 141 volatile address from_interpreted_entry() const{ return OrderAccess::load_acquire(&_from_interpreted_entry); } [pre-existing] The volatile qualifiers here seem suspect to me. ------------------------------------------------------------------------------ src/hotspot/share/oops/oop.inline.hpp 391 narrowOop old = (narrowOop)Atomic::xchg(val, (narrowOop*)dest); Cast of return type is not needed. 
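To see why casts like the (intx)0 above are needed at all, a grossly simplified stand-in for the templated cmpxchg (the real one lives in atomic.hpp; the __sync builtin is only there to make the sketch complete): all three arguments must deduce the same T, so a bare literal 0, which is an int, clashes with the intx operands.

#include <stdint.h>

typedef intptr_t intx;                      // as in HotSpot's globalDefinitions

template <typename T>
T cmpxchg_sketch(T exchange_value, T volatile* dest, T compare_value) {
  return __sync_val_compare_and_swap(dest, compare_value, exchange_value);
}

int main() {
  volatile intx flags = 0;
  intx newflags = 0x10;
  // cmpxchg_sketch(newflags, &flags, 0);     // error: T deduced as both intx and int
  cmpxchg_sketch(newflags, &flags, (intx)0);  // exact match on all three operands
  return (int)flags;
}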
------------------------------------------------------------------------------ src/hotspot/share/prims/jni.cpp [pre-existing] copy_jni_function_table should be using Copy::disjoint_words_atomic. ------------------------------------------------------------------------------ src/hotspot/share/prims/jni.cpp [pre-existing] 3892 // We're about to use Atomic::xchg for synchronization. Some Zero 3893 // platforms use the GCC builtin __sync_lock_test_and_set for this, 3894 // but __sync_lock_test_and_set is not guaranteed to do what we want 3895 // on all architectures. So we check it works before relying on it. 3896 #if defined(ZERO) && defined(ASSERT) 3897 { 3898 jint a = 0xcafebabe; 3899 jint b = Atomic::xchg(0xdeadbeef, &a); 3900 void *c = &a; 3901 void *d = Atomic::xchg(&b, &c); 3902 assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, "Atomic::xchg() works"); 3903 assert(c == &b && d == &a, "Atomic::xchg() works"); 3904 } 3905 #endif // ZERO && ASSERT It seems rather strange to be testing Atomic::xchg() here, rather than as part of unit testing Atomic? Fail unit testing => don't try to use... ------------------------------------------------------------------------------ src/hotspot/share/prims/jvmtiRawMonitor.cpp 130 if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { 142 if (_owner == NULL && Atomic::cmpxchg_if_null((void*)Self, &_owner)) { I think these casts aren't needed. _owner is void*, and Self is Thread*, which is implicitly convertible to void*. Similarly here, for the THREAD argument: 280 Contended = Atomic::cmpxchg((void*)THREAD, &_owner, (void*)NULL); 283 Contended = Atomic::cmpxchg((void*)THREAD, &_owner, (void*)NULL); ------------------------------------------------------------------------------ src/hotspot/share/prims/jvmtiRawMonitor.hpp This file is in the webrev, but seems to be unchanged. ------------------------------------------------------------------------------ src/hotspot/share/runtime/atomic.hpp 520 template 521 inline D Atomic::sub(I sub_value, D volatile* dest) { 522 STATIC_ASSERT(IsPointer::value || IsIntegral::value); 523 // Assumes two's complement integer representation. 524 #pragma warning(suppress: 4146) 525 return Atomic::add(-sub_value, dest); 526 } I'm pretty sure this implementation is incorrect. I think it produces the wrong result when I and D are both unsigned integer types and sizeof(I) < sizeof(D). ------------------------------------------------------------------------------ src/hotspot/share/runtime/mutex.cpp 304 intptr_t v = Atomic::cmpxchg((intptr_t)_LBIT, &_LockWord.FullWord, (intptr_t)0); // agro ... _LBIT should probably be intptr_t, rather than an enum. Note that the enum type is unused. The old value here is another place where an implicit widening of same signedness would have been nice. (Such implicit widening doesn't work for enums, since it's unspecified whether they default to signed or unsigned representation, and implementatinos differ.) ------------------------------------------------------------------------------ src/hotspot/share/runtime/mutex.hpp [pre-existing] I think the Address member of the SplitWord union is unused. Looking at AcquireOrPush (and others), I'm wondering whether it *should* be used there, or whether just using intptr_t casts and doing integral arithmetic (as is presently being done) is easier and clearer. Also the _LSBINDEX macro probably ought to be defined in mutex.cpp rather than polluting the global namespace. And technically, that name is reserved word. 
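The failure mode called out above for the negate-then-add formulation of Atomic::sub can be shown without any atomics at all; with the operand type unsigned and narrower than the destination, -sub_value wraps modulo 2^32 and then widens into a huge positive addend (plain arithmetic below stands in for the Atomic::add call):

#include <stdint.h>
#include <stdio.h>

int main() {
  uint64_t dest      = 100;   // D: the 64-bit destination value
  uint32_t sub_value = 1;     // I: narrower unsigned operand

  // What add(-sub_value, &dest) computes once -sub_value (0xFFFFFFFF) has been
  // widened to 64 bits: dest grows by 2^32 - 1 instead of shrinking by 1.
  uint64_t wrong = dest + (uint64_t)(uint32_t)(-sub_value);

  // What the caller actually asked for.
  uint64_t right = dest - sub_value;

  printf("wrong = %llu\n", (unsigned long long)wrong);   // 4294967395
  printf("right = %llu\n", (unsigned long long)right);   // 99
  return 0;
}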
------------------------------------------------------------------------------ src/hotspot/share/runtime/objectMonitor.cpp 252 void * cur = Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL); 409 if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { 1983 ox = (Thread*)Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL); I think the casts of Self aren't needed. ------------------------------------------------------------------------------ src/hotspot/share/runtime/objectMonitor.cpp 995 if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { 1020 if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { I think the casts of THREAD aren't needed. ------------------------------------------------------------------------------ src/hotspot/share/runtime/objectMonitor.hpp 254 markOopDesc* volatile* header_addr(); Why isn't this volatile markOop* ? ------------------------------------------------------------------------------ src/hotspot/share/runtime/synchronizer.cpp 242 Atomic::cmpxchg_if_null((void*)Self, &(m->_owner))) { I think the cast of Self isn't needed. ------------------------------------------------------------------------------ src/hotspot/share/runtime/synchronizer.cpp 992 for (; block != NULL; block = (PaddedEnd *)next(block)) { 1734 for (; block != NULL; block = (PaddedEnd *)next(block)) { [pre-existing] All calls to next() pass a PaddedEnd* and cast the result. How about moving all that behavior into next(). ------------------------------------------------------------------------------ src/hotspot/share/runtime/synchronizer.cpp 1970 if (monitor > (ObjectMonitor *)&block[0] && 1971 monitor < (ObjectMonitor *)&block[_BLOCKSIZE]) { [pre-existing] Are the casts needed here? I think PaddedEnd is derived from ObjectMonitor, so implicit conversions should apply. ------------------------------------------------------------------------------ src/hotspot/share/runtime/synchronizer.hpp 28 #include "memory/padded.hpp" 163 static PaddedEnd * volatile gBlockList; I was going to suggest as an alternative just making gBlockList a file scoped variable in synchronizer.cpp, since it isn't used outside of that file. Except that it is referenced by vmStructs. Curses! ------------------------------------------------------------------------------ src/hotspot/share/runtime/thread.cpp 4707 intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, (intptr_t)0); This and other places suggest LOCKBIT should be defined as intptr_t, rather than as an enum value. The MuxBits enum type is unused. And the cast of 0 is another case where implicit widening would be nice. ------------------------------------------------------------------------------ src/hotspot/share/services/mallocSiteTable.cpp 261 bool MallocSiteHashtableEntry::atomic_insert(const MallocSiteHashtableEntry* entry) { 262 return Atomic::cmpxchg_if_null(entry, (const MallocSiteHashtableEntry**)&_next); 263 } I think the problem here that is leading to the cast is that atomic_insert is taking a const T*. Note that it's only caller passes a non-const T*. 
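On the synchronizer.cpp question above about whether the (ObjectMonitor *) casts are needed: a generic sketch (not the real padded.hpp) of a padded wrapper that derives from its element type; the derived-to-base pointer conversion is implicit, which is the basis of the comment.

template <class T>
class PaddedSketch : public T {
  char _pad[64];   // keeps neighbouring elements on separate cache lines
};

struct MonitorSketch { void* _header; };

int main() {
  PaddedSketch<MonitorSketch> block[4];
  MonitorSketch* first = &block[0];  // implicit PaddedSketch<MonitorSketch>* to MonitorSketch*
  MonitorSketch* last  = &block[3];  // same conversion, still no cast
  return (first < last) ? 0 : 1;     // range-style comparisons work on the base type
}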
------------------------------------------------------------------------------ From david.holmes at oracle.com Fri Oct 13 00:55:53 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 13 Oct 2017 10:55:53 +1000 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> Message-ID: <2e8d66a6-24c3-b4de-e187-47a9e582238c@oracle.com> Hi Kim, Very detailed analysis! A few things have already been updated by Coleen. Many of the issues with possibly incorrect/inappropriate types really need to be dealt with separately - they go beyond the basic renaming - by their component teams. Similarly any ABA issues - which are likely non-issues but not clearly documented - should be handled separately. And the potential race you highlight below - though to be honest I couldn't match your statements with the code as shown. Thanks, David On 13/10/2017 9:17 AM, Kim Barrett wrote: >> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >> >> Summary: With the new template functions these are unnecessary. >> >> The changes are mostly s/_ptr// and removing the cast to return type. There weren't many types that needed to be improved to match the template version of the function. Some notes: >> 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, rearranging arguments. >> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. I disliked the first name because it's not explicit from the callers that there's an underlying cas. If people want to fight, I'll remove the function and use cmpxchg because there are only a couple places where this is a little nicer. >> 3. Added Atomic::sub() >> >> Tested with JPRT, mach5 tier1-5 on linux,windows and solaris. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8188220 >> >> Thanks, >> Coleen > > I looked harder at the potential ABA problems, and believe they are > okay. There can be multiple threads doing pushes, and there can be > multiple threads doing pops, but not both at the same time. > > ------------------------------------------------------------------------------ > src/hotspot/cpu/zero/cppInterpreter_zero.cpp > 279 if (Atomic::cmpxchg(monitor, lockee->mark_addr(), disp) != disp) { > > How does this work? monitor and disp seem like they have unrelated > types? Given that this is zero-specific code, maybe this hasn't been > tested? > > Similarly here: > 423 if (Atomic::cmpxchg(header, rcvr->mark_addr(), lock) != lock) { > > ------------------------------------------------------------------------------ > src/hotspot/share/asm/assembler.cpp > 239 dcon->value_fn = cfn; > > Is it actually safe to remove the atomic update? If multiple threads > performing the assignment *are* possible (and I don't understand the > context yet, so don't know the answer to that), then a bare non-atomic > assignment is a race, e.g. undefined behavior. > > Regardless of that, I think the CAST_FROM_FN_PTR should be retained. > > ------------------------------------------------------------------------------ > src/hotspot/share/classfile/classLoaderData.cpp > 167 Chunk* head = (Chunk*) OrderAccess::load_acquire(&_head); > > I think the cast to Chunk* is no longer needed. 
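A toy illustration of why that cast can go: with an acquire load templated on the pointee (std::atomic here stands in for the OrderAccess primitive), the call site gets back a typed pointer directly.

#include <atomic>
#include <cassert>

// Toy stand-in for a templated acquire load; the deduced return type is what
// removes the "(Chunk*)" cast at call sites like the one quoted above.
template <typename T>
T* load_acquire_sketch(const std::atomic<T*>& slot) {
  return slot.load(std::memory_order_acquire);
}

struct Chunk { int payload; };

int main() {
  Chunk c = { 42 };
  std::atomic<Chunk*> head(&c);
  Chunk* h = load_acquire_sketch(head);   // no cast needed
  assert(h->payload == 42);
  return 0;
}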
> > ------------------------------------------------------------------------------ > src/hotspot/share/classfile/classLoaderData.cpp > 946 ClassLoaderData* old = Atomic::cmpxchg(cld, cld_addr, (ClassLoaderData*)NULL); > 947 if (old != NULL) { > 948 delete cld; > 949 // Returns the data. > 950 return old; > 951 } > > That could instead be > > if (!Atomic::replace_if_null(cld, cld_addr)) { > delete cld; // Lost the race. > return *cld_addr; // Use the winner's value. > } > > And apparently the caller of CLDG::add doesn't care whether the > returned CLD has actually been added to the graph yet. If that's not > true, then there's a bug here, since a race loser might return a > winner's value before the winner has actually done the insertion. > > ------------------------------------------------------------------------------ > src/hotspot/share/classfile/verifier.cpp > 71 static void* verify_byte_codes_fn() { > 72 if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == NULL) { > 73 void *lib_handle = os::native_java_library(); > 74 void *func = os::dll_lookup(lib_handle, "VerifyClassCodesForMajorVersion"); > 75 OrderAccess::release_store(&_verify_byte_codes_fn, func); > 76 if (func == NULL) { > 77 _is_new_verify_byte_codes_fn = false; > 78 func = os::dll_lookup(lib_handle, "VerifyClassCodes"); > 79 OrderAccess::release_store(&_verify_byte_codes_fn, func); > 80 } > 81 } > 82 return (void*)_verify_byte_codes_fn; > 83 } > > [pre-existing] > > I think this code has race problems; a caller could unexpectedly and > inappropriately return NULL. Consider the case where there is no > VerifyClassCodesForMajorVersion, but there is VerifyClassCodes. > > The variable is initially NULL. > > Both Thread1 and Thread2 reach line 73, having both seen a NULL value > for the variable. > > Thread1 reaches line 80, setting the variable to VerifyClassCodes. > > Thread2 reaches line 76, resetting the variable to NULL. > > Thread1 reads the now (momentarily) NULL value and returns it. > > I think the first release_store should be conditional on func != NULL. > Also, the usage of _is_new_verify_byte_codes_fn seems suspect. > And a minor additional nit: the cast in the return is unnecessary. > > ------------------------------------------------------------------------------ > src/hotspot/share/code/nmethod.cpp > 1664 nmethod* observed_mark_link = _oops_do_mark_link; > 1665 if (observed_mark_link == NULL) { > 1666 // Claim this nmethod for this thread to mark. > 1667 if (Atomic::cmpxchg_if_null(NMETHOD_SENTINEL, &_oops_do_mark_link)) { > > With these changes, the only use of observed_mark_link is in the if. > I'm not sure that variable is really useful anymore, e.g. just use > > if (_oops_do_mark_link == NULL) { > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp > > In CMSCollector::par_take_from_overflow_list, if BUSY and prefix were > of type oopDesc*, I think there would be a whole lot fewer casts and > cast_to_oop's. Later on, I think suffix_head, observed_overflow_list, > and curr_overflow_list could also be oopDesc* instead of oop to > eliminate more casts. > > And some similar changes in CMSCollector::par_push_on_overflow_list. > > And similarly in parNewGeneration.cpp, in push_on_overflow_list and > take_from_overflow_list_work. > > As noted in the comments for JDK-8165857, the lists and "objects" > involved here aren't really oops, but rather the shattered remains of > oops. 
The suggestion there was to use HeapWord* and carry through the > fanout; what was actually done was to change _overflow_list to > oopDesc* to minimize fanout, even though that's kind of lying to the > type system. Now, with the cleanup of cmpxchg_ptr and such, we're > paying the price of doing the minimal thing back then. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp > 7960 Atomic::add(-n, &_num_par_pushes); > > Atomic::sub > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/cms/parNewGeneration.cpp > 1455 Atomic::add(-n, &_num_par_pushes); > > Atomic::sub > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/dirtyCardQueue.cpp > 283 void* actual = Atomic::cmpxchg(next, &_cur_par_buffer_node, nd); > ... > 289 nd = static_cast(actual); > > Change actual's type to BufferNode* and remove the cast on line 289. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1CollectedHeap.cpp > > [pre-existing] > 3499 old = (CompiledMethod*)_postponed_list; > > I think that cast is only needed because > G1CodeCacheUnloadingTask::_postponed_list is incorrectly typed as > "volatile CompiledMethod*", when I think it ought to be > "CompiledMethod* volatile". > > I think G1CodeCacheUnloading::_claimed_nmethod is similarly mis-typed, > with a similar should not be needed cast: > 3530 first = (CompiledMethod*)_claimed_nmethod; > > and another for _postponed_list here: > 3552 claim = (CompiledMethod*)_postponed_list; > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1HotCardCache.cpp > 77 jbyte* previous_ptr = (jbyte*)Atomic::cmpxchg(card_ptr, > > I think the cast of the cmpxchg result is no longer needed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp > 254 char* touch_addr = (char*)Atomic::add(actual_chunk_size, &_cur_addr) - actual_chunk_size; > > I think the cast of the add result is no longer needed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1StringDedup.cpp > 213 return (size_t)Atomic::add(partition_size, &_next_bucket) - partition_size; > > I think the cast of the add result is no longer needed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/heapRegionRemSet.cpp > 200 PerRegionTable* res = > 201 Atomic::cmpxchg(nxt, &_free_list, fl); > > Please remove the line break, now that the code has been simplified. > > But wait, doesn't this alloc exhibit classic ABA problems? I *think* > this works because alloc and bulk_free are called in different phases, > never overlapping. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/sparsePRT.cpp > 295 SparsePRT* res = > 296 Atomic::cmpxchg(sprt, &_head_expanded_list, hd); > and > 307 SparsePRT* res = > 308 Atomic::cmpxchg(next, &_head_expanded_list, hd); > > I'd rather not have the line breaks in these either. > > And get_from_expanded_list also appears to have classic ABA problems. > I *think* this works because add_to_expanded_list and > get_from_expanded_list are called in different phases, never > overlapping. 
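For readers unfamiliar with the "classic ABA problems" referred to in these two comments, a minimal sketch of the hazard on a CAS-maintained list (std::atomic stands in for Atomic::cmpxchg); as noted, the HotSpot lists only get away with it because pushes and pops run in disjoint phases.

#include <atomic>

struct Node { Node* next; };

// A pop that is fine in isolation but ABA-prone if another thread can pop
// this node and push it back between our load of 'next' and the CAS: the CAS
// compares only the head pointer value, not the identity of the list behind it.
Node* pop_aba_prone(std::atomic<Node*>& head) {
  Node* old_head = head.load();
  while (old_head != nullptr) {
    Node* next = old_head->next;                  // may be stale by CAS time
    if (head.compare_exchange_weak(old_head, next)) {
      return old_head;                            // ABA: 'next' may no longer be on the list
    }
    // on failure compare_exchange_weak reloaded old_head; retry
  }
  return nullptr;
}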
> > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/taskqueue.inline.hpp > 262 return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, > 263 (volatile intptr_t *)&_data, > 264 (intptr_t)old_age._data); > > This should be > > return Atomic::cmpxchg(new_age._data, &_data, old_age._data); > > ------------------------------------------------------------------------------ > src/hotspot/share/interpreter/bytecodeInterpreter.cpp > This doesn't have any casts, which I think is correct. > 708 if (Atomic::cmpxchg(header, rcvr->mark_addr(), mark) == mark) { > > but these do. > 718 if (Atomic::cmpxchg((void*)new_header, rcvr->mark_addr(), mark) == mark) { > 737 if (Atomic::cmpxchg((void*)new_header, rcvr->mark_addr(), header) == header) { > > I'm not sure how the ones with casts even compile? mark_addr() seems > to be a markOop*, which is a markOopDesc**, where markOopDesc is a > class. void* is not implicitly convertible to markOopDesc*. > > Hm, this entire file is #ifdef CC_INTERP. Is this zero-only code? Or > something like that? > > Similarly here: > 906 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) { > and > 917 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) { > 935 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) { > > and here: > 1847 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) { > 1858 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) { > 1878 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) { > > and here: > 1847 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) { > 1858 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) { > 1878 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) { > > ------------------------------------------------------------------------------ > src/hotspot/share/memory/metaspace.cpp > 1502 size_t value = OrderAccess::load_acquire(&_capacity_until_GC); > ... > 1537 return (size_t)Atomic::sub((intptr_t)v, &_capacity_until_GC); > > These and other uses of _capacity_until_GC suggest that variable's > type should be size_t rather than intptr_t. Note that I haven't done > a careful check of uses to see if there are any places where such a > change would cause problems. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/constantPool.cpp > 229 OrderAccess::release_store((Klass* volatile *)adr, k); > 246 OrderAccess::release_store((Klass* volatile *)adr, k); > 514 OrderAccess::release_store((Klass* volatile *)adr, k); > > Casts are not needed. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/constantPool.hpp > 148 volatile intptr_t adr = OrderAccess::load_acquire(obj_at_addr_raw(which)); > > [pre-existing] > Why is adr declared volatile? > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/cpCache.cpp > 157 intx newflags = (value & parameter_size_mask); > 158 Atomic::cmpxchg(newflags, &_flags, (intx)0); > > This is a nice demonstration of why I wanted to include some value > preserving integral conversions in cmpxchg, rather than requiring > exact type matching in the integral case. There have been some others > that I haven't commented on. 
Apparently we (I) got away with > including such conversions in Atomic::add, which I'd forgotten about. > And see comment regarding Atomic::sub below. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/cpCache.hpp > 139 volatile Metadata* _f1; // entry specific metadata field > > [pre-existing] > I suspect the type should be Metadata* volatile. And that would > eliminate the need for the cast here: > > 339 Metadata* f1_ord() const { return (Metadata *)OrderAccess::load_acquire(&_f1); } > > I don't know if there are any other changes needed or desirable around > _f1 usage. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/method.hpp > 139 volatile address from_compiled_entry() const { return OrderAccess::load_acquire(&_from_compiled_entry); } > 140 volatile address from_compiled_entry_no_trampoline() const; > 141 volatile address from_interpreted_entry() const{ return OrderAccess::load_acquire(&_from_interpreted_entry); } > > [pre-existing] > The volatile qualifiers here seem suspect to me. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/oop.inline.hpp > 391 narrowOop old = (narrowOop)Atomic::xchg(val, (narrowOop*)dest); > > Cast of return type is not needed. > > ------------------------------------------------------------------------------ > src/hotspot/share/prims/jni.cpp > > [pre-existing] > > copy_jni_function_table should be using Copy::disjoint_words_atomic. > > ------------------------------------------------------------------------------ > src/hotspot/share/prims/jni.cpp > > [pre-existing] > > 3892 // We're about to use Atomic::xchg for synchronization. Some Zero > 3893 // platforms use the GCC builtin __sync_lock_test_and_set for this, > 3894 // but __sync_lock_test_and_set is not guaranteed to do what we want > 3895 // on all architectures. So we check it works before relying on it. > 3896 #if defined(ZERO) && defined(ASSERT) > 3897 { > 3898 jint a = 0xcafebabe; > 3899 jint b = Atomic::xchg(0xdeadbeef, &a); > 3900 void *c = &a; > 3901 void *d = Atomic::xchg(&b, &c); > 3902 assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, "Atomic::xchg() works"); > 3903 assert(c == &b && d == &a, "Atomic::xchg() works"); > 3904 } > 3905 #endif // ZERO && ASSERT > > It seems rather strange to be testing Atomic::xchg() here, rather than > as part of unit testing Atomic? Fail unit testing => don't try to > use... > > ------------------------------------------------------------------------------ > src/hotspot/share/prims/jvmtiRawMonitor.cpp > 130 if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { > 142 if (_owner == NULL && Atomic::cmpxchg_if_null((void*)Self, &_owner)) { > > I think these casts aren't needed. _owner is void*, and Self is > Thread*, which is implicitly convertible to void*. > > Similarly here, for the THREAD argument: > 280 Contended = Atomic::cmpxchg((void*)THREAD, &_owner, (void*)NULL); > 283 Contended = Atomic::cmpxchg((void*)THREAD, &_owner, (void*)NULL); > > ------------------------------------------------------------------------------ > src/hotspot/share/prims/jvmtiRawMonitor.hpp > > This file is in the webrev, but seems to be unchanged. 
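On the volatile-placement comments (G1CodeCacheUnloadingTask::_postponed_list and _claimed_nmethod above, and the ConstantPoolCacheEntry::_f1 one): the two declarations differ only in where volatile binds, but only one of them matches a load_acquire-style template without casts. A standalone illustration with invented field names and a simplified, ordering-free load_acquire stand-in (not the OrderAccess implementation):

  class Metadata;

  // Pointer to volatile Metadata: the pointee is volatile, the field is not.
  volatile Metadata* _f1_pointee_volatile;

  // Volatile pointer to Metadata: the field itself is volatile, which is what
  // a field written concurrently and read with load_acquire really wants.
  Metadata* volatile _f1_field_volatile;

  // Shape of a templated acquire-load (memory ordering elided; only the type
  // deduction matters for this illustration).
  template <typename T>
  T load_acquire(T const volatile* addr) { return *addr; }

  Metadata* f1_ord() {
    // return load_acquire(&_f1_pointee_volatile);  // T deduces to volatile Metadata*,
    //                                              // so the caller needs a cast of the
    //                                              // result -- the pattern flagged above
    return load_acquire(&_f1_field_volatile);       // T deduces to Metadata*, no cast
  }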
> > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/atomic.hpp > 520 template > 521 inline D Atomic::sub(I sub_value, D volatile* dest) { > 522 STATIC_ASSERT(IsPointer::value || IsIntegral::value); > 523 // Assumes two's complement integer representation. > 524 #pragma warning(suppress: 4146) > 525 return Atomic::add(-sub_value, dest); > 526 } > > I'm pretty sure this implementation is incorrect. I think it produces > the wrong result when I and D are both unsigned integer types and > sizeof(I) < sizeof(D). > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/mutex.cpp > 304 intptr_t v = Atomic::cmpxchg((intptr_t)_LBIT, &_LockWord.FullWord, (intptr_t)0); // agro ... > > _LBIT should probably be intptr_t, rather than an enum. Note that the > enum type is unused. The old value here is another place where an > implicit widening of same signedness would have been nice. (Such > implicit widening doesn't work for enums, since it's unspecified > whether they default to signed or unsigned representation, and > implementatinos differ.) > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/mutex.hpp > > [pre-existing] > > I think the Address member of the SplitWord union is unused. Looking > at AcquireOrPush (and others), I'm wondering whether it *should* be > used there, or whether just using intptr_t casts and doing integral > arithmetic (as is presently being done) is easier and clearer. > > Also the _LSBINDEX macro probably ought to be defined in mutex.cpp > rather than polluting the global namespace. And technically, that > name is reserved word. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/objectMonitor.cpp > 252 void * cur = Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL); > 409 if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { > 1983 ox = (Thread*)Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL); > > I think the casts of Self aren't needed. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/objectMonitor.cpp > 995 if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { > 1020 if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { > > I think the casts of THREAD aren't needed. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/objectMonitor.hpp > 254 markOopDesc* volatile* header_addr(); > > Why isn't this volatile markOop* ? > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/synchronizer.cpp > 242 Atomic::cmpxchg_if_null((void*)Self, &(m->_owner))) { > > I think the cast of Self isn't needed. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/synchronizer.cpp > 992 for (; block != NULL; block = (PaddedEnd *)next(block)) { > 1734 for (; block != NULL; block = (PaddedEnd *)next(block)) { > > [pre-existing] > All calls to next() pass a PaddedEnd* and cast the > result. How about moving all that behavior into next(). > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/synchronizer.cpp > 1970 if (monitor > (ObjectMonitor *)&block[0] && > 1971 monitor < (ObjectMonitor *)&block[_BLOCKSIZE]) { > > [pre-existing] > Are the casts needed here? 
I think PaddedEnd is > derived from ObjectMonitor, so implicit conversions should apply. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/synchronizer.hpp > 28 #include "memory/padded.hpp" > 163 static PaddedEnd * volatile gBlockList; > > I was going to suggest as an alternative just making gBlockList a file > scoped variable in synchronizer.cpp, since it isn't used outside of > that file. Except that it is referenced by vmStructs. Curses! > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/thread.cpp > 4707 intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, (intptr_t)0); > > This and other places suggest LOCKBIT should be defined as intptr_t, > rather than as an enum value. The MuxBits enum type is unused. > > And the cast of 0 is another case where implicit widening would be nice. > > ------------------------------------------------------------------------------ > src/hotspot/share/services/mallocSiteTable.cpp > 261 bool MallocSiteHashtableEntry::atomic_insert(const MallocSiteHashtableEntry* entry) { > 262 return Atomic::cmpxchg_if_null(entry, (const MallocSiteHashtableEntry**)&_next); > 263 } > > I think the problem here that is leading to the cast is that > atomic_insert is taking a const T*. Note that it's only caller passes > a non-const T*. > > ------------------------------------------------------------------------------ > From david.holmes at oracle.com Fri Oct 13 03:08:23 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 13 Oct 2017 13:08:23 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <9956F9D0-B01B-44FE-AE56-527907816436@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> <799205ae-ba9f-ce3a-8dd6-1a55e32689df@oracle.com> <9956F9D0-B01B-44FE-AE56-527907816436@oracle.com> Message-ID: <20ef0bac-1942-b29f-a9e2-4ea4d4f81cd2@oracle.com> Hi Bob, On 13/10/2017 1:43 AM, Bob Vandette wrote: > >> On Oct 11, 2017, at 9:04 PM, David Holmes wrote: >> >> Hi Bob, >> >> On 12/10/2017 5:11 AM, Bob Vandette wrote: >>> Here?s an updated webrev for this RFE that contains changes and cleanups based on feedback I?ve received so far. >>> I?m still investigating the best approach for reacting to cpu shares and quotas. I do not believe doing nothing is the answer. >> >> I do. :) Let me try this again. When you run outside of a container you don't get 100% of the CPUs - you have to share with whatever else is running on the system. You get a fraction of CPU time based on the load. We don't try to communicate load information to the VM/application so it can adapt. Within a container setting shares/quotas is just a way of setting an artificial load. So why should we be treating it any differently? > Because today we optimize for a lightly loaded system and when running serverless applications in containers we should be > optimizing for a fully loaded system. If developers don?t want this, then don?t use shares or quotas and you?ll have exactly > the behavior you have today. 
I think we just have to document the new behavior (and how to turn it off) so people know what > to expect. The person deploying the app may not have control over how the app is deployed in terms of shares/quotas. It all depends how (and who) manages the containers. This is a big part of my problem/concerns here that I don't know exactly how all this is organized and who knows what in advance and what they can control. But I'll let this drop, other than raising an additional concern. I don't think just allowing the user to hardwire the number of processors to use will necessarily solve the problem with what available_processors() returns. I'm concerned the execution of the VM may occur in a context where the number of processors is not known in advance, and the user can not disable shares/quotas. In that case we may need to have a flag that says to ignore shares/quotas in the processor count calculation. > You seem to discount the added cost of 100s of VMs creating lots of un-necessaary threads. In the current JDK 10 code base, > In a heavily loaded system with 88 processors, VmData grows from 60MBs (1 cpu) to 376MB (88 cpus). This is only mapped > memory and it depends heavily on how deep in the stack these threads go before it impacts VmRSS but it shows the potential downside > of having 100s of VMs thinking they each own the entire machine. I agree that the default ergonomics does not scale well. Anyone doing any serious Java deployment tunes the VM explicitly and does not rely on the defaults. How will they do that in a container environment? I don't know. I would love to see some actual deployment scenarios/experiences for this to understand things better. > I haven?t even done any experiments to determine the added context switching cost if the VM decides to use excessive > pthreads. > >> >> That's not to say an API to provide load/shares/quota information may not be useful, but that is a separate issue to what the "active processor count" should report. > I don?t have a problem with active processor count reporting the number of processors we have, but I do have a problem > with our current usage of this information within the VM and Core libraries. That is a somewhat separate issue. One worth pursuing separately. >> >>> http://cr.openjdk.java.net/~bobv/8146115/webrev.01 >>> Updates: >>> 1. I had to move the processing of AggressiveHeap since the container memory size needs to be known before this can be processed. >> >> I don't like the placement of this - we don't call os:: init functions from inside Arguments - we manage the initialization sequence from Threads::create_vm. Seems to me that container initialization can/should happen in os::init_before_ergo, and the AggressiveHeap processing can occur at the start of Arguments::apply_ergo(). >> >> That said we need to be sure nothing touched by set_aggressive_heap_flags will be used before we now reach that code - there are a lot of flags being set in there. > > This is exactly the reason why I put the call where it did. I put the call to set_aggressive_heap_flags in finalize_vm_init_args > because that is exactly what this call is doing. It?s finalizing flags used after the parsing. The impacted flags are definitely being > used shortly after and before init_before_ergo is called. I see that now and it is very unfortunate because I really do not like what you had to do here. 
As you can tell from the logic in create_vm we have always refactored to ensure we can progressively manage the interleaving of OS initialization with Arguments processing. So having a deep part of Argument processing go off and call some more OS initialization is not nice. That said I can't see a way around it without very unreasonable refactoring. But I do have a couple of changes I'd like to request please: 1. Move the call to os::initialize_container_support() up a level to before the call to finalize_vm_init_args(), with a more elaborate comment: // We need to ensure processor and memory resources have been properly // configured - which may rely on arguments we just processed - before // doing the final argument processing. Any argument processing that // needs to know about processor and memory resources must occur after // this point. os::initialize_container_support(); // Do final processing now that all arguments have been parsed result = finalize_vm_init_args(patch_mod_javabase); 2. Simplify and modify os.hpp as follows: + LINUX_ONLY(static void pd_initialize_container_support();) public: static void init(void); // Called before command line parsing + static void initialize_container_support() { // Called during command line parsing + LINUX_ONLY(pd_initialize_container_support();) + } static void init_before_ergo(void); // Called after command line parsing // before VM ergonomics 3. In thread.cpp add a comment here: // Parse arguments + // Note: this internally calls os::initialize_container_support() jint parse_result = Arguments::parse(args); Thanks. > >> >>> 2. I no longer use the cpuset.cpus contents since sched_getaffinity reports the correct results >>> even if someone manually updates the cgroup data. I originally didn?t think this was the case since >>> sched_setaffinity didn?t automatically update the cpuset file contents but the inverse is true. >> >> Ok. >> >>> 3. I ifdef?d the container function support in src/hotspot/share/runtime/os.hpp to avoid putting stubs in all other os >>> platform directories. I can do this if it?s absolutely necessary. >> >> You should not need to do this if initialization moves as I suggested above. os::init_before_ergo() in os_linux.cpp can call OSContainer::init(). > >> No need for os::initialize_container_support() or os::pd_initialize_container_support. > > But os::init_before_ergo is in shared code. Yep my bad - point is moot now anyway. >> src/hotspot/os/linux/os_linux.cpp/.hpp >> >> 187 log_trace(os)("available container memory: " JULONG_FORMAT, avail_mem); >> 188 return avail_mem; >> 189 } else { >> 190 log_debug(os,container)("container memory usage call failed: " JLONG_FORMAT, mem_usage); >> >> Why "trace" (the third logging level) to show the information, but "debug" (the second level) to show failed calls? You use debug in other files for basic info. Overall I'm unclear on your use of debug versus trace for the logging. > > I use trace for noisy information that is not reporting errors and debug for failures that are informational and not fatal. > In this case, the call could return -1 or -2. -1 is unlimited and -2 is an error. In either case we fallback to the > standard system call to get available memory. I would have used warning but since these messages were occurring > during a test run causing test failures. Okay. Thanks for clarifying. >> >> --- >> >> src/hotspot/os/linux/osContainer_linux.cpp >> >> Dead code: >> >> 376 #if 0 >> 377 os::Linux::print_container_info(tty); >> ... 
>> 390 #endif > > I left it in for standalone testing. Should I use some other #if? We don't generally leave in dead code in the runtime code. Do you see this as useful after you've finalized the changes? Is this testing just for showing the logging? Is it worth making this a logging controlled call? Is it suitable for a Gtest test? Thanks, David ----- > Bob. > >> >> Thanks, >> David >> >>> Bob. > From goetz.lindenmaier at sap.com Fri Oct 13 06:38:59 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 13 Oct 2017 06:38:59 +0000 Subject: RFR(M): 8189102: All tools should support -?, -h and --help In-Reply-To: References: Message-ID: <2cd7785d6dad442e90d403b2eb96c588@sap.com> Hi Vladimir, added that for jaotc, thanks! Best regards, Goetz. > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf Of Vladimir Kozlov > Sent: Donnerstag, 12. Oktober 2017 01:04 > To: hotspot-dev at openjdk.java.net > Subject: Re: RFR(M): 8189102: All tools should support -?, -h and --help > > You missed AOT tool jaotc: > > http://hg.openjdk.java.net/jdk10/hs/file/44117bc2bedf/src/jdk.aot/share/cl > asses/jdk.tools.jaotc/src/jdk/tools/jaotc/Options.java#l230 > > }, new Option(" --help Print this usage message", false, "--help", > "-h", "-?") { > > Vladimir > > On 10/11/17 1:06 PM, Lindenmaier, Goetz wrote: > > Hi > > > > The tools in jdk should all show the same behavior wrt. help flags. > > This change normalizes the help flags of a row of the tools in the jdk. > > Java accepts -?, -h and --help, thus I changed the tools to support > > these, too. Some tools exited with '1' after displaying the help message, > > I turned this to '0'. > > > > Maybe this is not the right mailing list for this, please advise. > > > > Please review this change. I please need a sponsor. > > http://cr.openjdk.java.net/~goetz/wr17/8189102- > helpMessage/webrev.01/ > > > > In detail, this fixes the help message of the following tools: > > jar -? -h --help; added -?. > > jarsigner -? -h --help; added --help. -help accepted but not documented. > > javac -? --help; added -?. Removed -help. -h is taken for other > purpose > > javadoc -? -h --help; added -h -?. Removed -help > > javap -? -h --help; added -h. -help accepted but no more documented. > > jcmd -? -h --help; added -? --help. -help accepted but no more > documented. Changed return value to '0' > > jdb -? -h --help; added -? -h --help. -help accepted but no more > documented. > > jdeprscan -? -h --help; added -? > > jinfo -? -h --help; added -? --help. -help accepted but no more > documented. > > jjs -h --help; Replaced -help by --help. Adding more not straight > forward. > > jps -? -h --help; added -? --help. -help accepted but no more > documented. > > jshell -? -h --help; added -? > > jstat -? -h --help; added -h --help. -help accepted but no more > documented. > > > > Best regards, > > Goetz. > > From thomas.schatzl at oracle.com Fri Oct 13 13:04:21 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 13 Oct 2017 15:04:21 +0200 Subject: RFR(M) 8186834:Expanding old area without full GC in parallel GC In-Reply-To: References: Message-ID: <1507899861.3162.12.camel@oracle.com> Hi, On Tue, 2017-08-29 at 00:20 +0900, Michihiro Horie wrote: > Dear all, > > Would you please review the following change? > bug: https://bugs.openjdk.java.net/browse/JDK-8186834 > webrev: http://cr.openjdk.java.net/~mhorie/8186834/webrev.00/ > > In parallel GC, old area is expanded only after a full GC occurs. 
> I am wondering if we could give an option to expand old area without > full GC. So, I added an option > UseAdaptiveGenerationSizePolicyBeforeMajorCollection Sorry for the late (and probably stupid) question, but what is the difference (in performance) to simply set -Xms==-Xmx here? And why not make the (first) full gc expand the heap more aggressively? (I think there is at least one way to do that, something like Min/MaxFreeHeapRatio or so, I can look it up if needed). Thanks, Thomas > Following is a simple micro benchmark I used to see the benefit of > this change. > As a result, pause time of full GC reduced by 30%. Full GC count > reduced by 54%. > Elapsed time reduced by 7%. > > import java.util.HashMap; > import java.util.Map; > public class HeapExpandTest { > ? static Map map = new HashMap<>(); > ? public static void main(String[] args) throws Exception { > ????long start = System.currentTimeMillis(); > ????for (int i = 0; i < 2200; ++i) { > ??????map.put(i, new byte[1024*1024]); // 1MB > ????} > ????System.out.println("elapsed= " + (System.currentTimeMillis() - > start)); > ? } > } > > JVM options: -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy > -XX:ParallelGCThreads=8 -Xms64m -Xmx3g > -XX:+UseAdaptiveGenerationSizePolicyBeforeMajorCollection From bob.vandette at oracle.com Fri Oct 13 13:14:19 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Fri, 13 Oct 2017 09:14:19 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <20ef0bac-1942-b29f-a9e2-4ea4d4f81cd2@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <33d617e7-7ec4-ebde-efa1-5602189e8470@oracle.com> <12909a67-6876-a40c-85b9-b959ed9f02df@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> <799205ae-ba9f-ce3a-8dd6-1a55e32689df@oracle.com> <9956F9D0-B01B-44FE-AE56-527907816436@oracle.com> <20ef0bac-1942-b29f-a9e2-4ea4d4f81cd2@oracle.com> Message-ID: > On Oct 12, 2017, at 11:08 PM, David Holmes wrote: > > Hi Bob, > > On 13/10/2017 1:43 AM, Bob Vandette wrote: >>> On Oct 11, 2017, at 9:04 PM, David Holmes wrote: >>> >>> Hi Bob, >>> >>> On 12/10/2017 5:11 AM, Bob Vandette wrote: >>>> Here?s an updated webrev for this RFE that contains changes and cleanups based on feedback I?ve received so far. >>>> I?m still investigating the best approach for reacting to cpu shares and quotas. I do not believe doing nothing is the answer. >>> >>> I do. :) Let me try this again. When you run outside of a container you don't get 100% of the CPUs - you have to share with whatever else is running on the system. You get a fraction of CPU time based on the load. We don't try to communicate load information to the VM/application so it can adapt. Within a container setting shares/quotas is just a way of setting an artificial load. So why should we be treating it any differently? >> Because today we optimize for a lightly loaded system and when running serverless applications in containers we should be >> optimizing for a fully loaded system. If developers don?t want this, then don?t use shares or quotas and you?ll have exactly >> the behavior you have today. I think we just have to document the new behavior (and how to turn it off) so people know what >> to expect. 
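For readers following along, the knobs being argued about are cgroup v1's cpu.cfs_quota_us / cpu.cfs_period_us (a hard cap) and cpu.shares (a relative weight). The sketch below is only one possible mapping to an "effective CPU" count; it is not the code in the webrev, the /sys/fs/cgroup/cpu paths and the shares/1024 convention are assumptions, and whether shares should be treated as a cap at all is exactly the policy question being debated here:

  #include <algorithm>
  #include <cstdio>
  #include <thread>

  // Read a single integer from a cgroup file; -1 on any failure.
  static long read_cgroup_value(const char* path) {
    FILE* f = std::fopen(path, "r");
    if (f == nullptr) return -1;
    long v = -1;
    if (std::fscanf(f, "%ld", &v) != 1) v = -1;
    std::fclose(f);
    return v;
  }

  int effective_processor_count() {
    int limit = (int)std::thread::hardware_concurrency();   // host CPU count
    long quota  = read_cgroup_value("/sys/fs/cgroup/cpu/cpu.cfs_quota_us");
    long period = read_cgroup_value("/sys/fs/cgroup/cpu/cpu.cfs_period_us");
    long shares = read_cgroup_value("/sys/fs/cgroup/cpu/cpu.shares");

    if (quota > 0 && period > 0) {
      // The quota is a hard cap: round up, e.g. 150ms/100ms -> 2 CPUs.
      limit = std::min(limit, (int)((quota + period - 1) / period));
    } else if (shares > 0) {
      // Shares are only a weight; mapping shares/1024 to a CPU count is the
      // "optimize for a fully loaded system" policy choice argued about above.
      limit = std::min(limit, std::max(1, (int)(shares / 1024)));
    }
    return std::max(1, limit);
  }

A real implementation would discover the cgroup mount point from /proc/self/mountinfo rather than hard-coding the path, and would leave a switch to fall back to the host count.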
> > The person deploying the app may not have control over how the app is deployed in terms of shares/quotas. It all depends how (and who) manages the containers. This is a big part of my problem/concerns here that I don't know exactly how all this is organized and who knows what in advance and what they can control. > > But I'll let this drop, other than raising an additional concern. I don't think just allowing the user to hardwire the number of processors to use will necessarily solve the problem with what available_processors() returns. I'm concerned the execution of the VM may occur in a context where the number of processors is not known in advance, and the user can not disable shares/quotas. In that case we may need to have a flag that says to ignore shares/quotas in the processor count calculation. I?m not sure that?s a high probability issue. It?s my understanding that whoever is configuring the container management will be specifying the resources required to run these applications which comes along with a guarantee of these resources. If this issue does come up, I do have the -XX:-UseContainerSupport big switch that turns all of this off. It will however disable the memory support as well. > >> You seem to discount the added cost of 100s of VMs creating lots of un-necessaary threads. In the current JDK 10 code base, >> In a heavily loaded system with 88 processors, VmData grows from 60MBs (1 cpu) to 376MB (88 cpus). This is only mapped >> memory and it depends heavily on how deep in the stack these threads go before it impacts VmRSS but it shows the potential downside >> of having 100s of VMs thinking they each own the entire machine. > > I agree that the default ergonomics does not scale well. Anyone doing any serious Java deployment tunes the VM explicitly and does not rely on the defaults. How will they do that in a container environment? I don't know. > > I would love to see some actual deployment scenarios/experiences for this to understand things better. This is one of the reasons I want to get this support out in JDK 10, to get some feedback under real scenarios. > >> I haven?t even done any experiments to determine the added context switching cost if the VM decides to use excessive >> pthreads. >>> >>> That's not to say an API to provide load/shares/quota information may not be useful, but that is a separate issue to what the "active processor count" should report. >> I don?t have a problem with active processor count reporting the number of processors we have, but I do have a problem >> with our current usage of this information within the VM and Core libraries. > > That is a somewhat separate issue. One worth pursuing separately. We should look at this as part of the ?Container aware Java? JEP. > >>> >>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.01 >>>> Updates: >>>> 1. I had to move the processing of AggressiveHeap since the container memory size needs to be known before this can be processed. >>> >>> I don't like the placement of this - we don't call os:: init functions from inside Arguments - we manage the initialization sequence from Threads::create_vm. Seems to me that container initialization can/should happen in os::init_before_ergo, and the AggressiveHeap processing can occur at the start of Arguments::apply_ergo(). >>> >>> That said we need to be sure nothing touched by set_aggressive_heap_flags will be used before we now reach that code - there are a lot of flags being set in there. >> This is exactly the reason why I put the call where it did. 
I put the call to set_aggressive_heap_flags in finalize_vm_init_args >> because that is exactly what this call is doing. It?s finalizing flags used after the parsing. The impacted flags are definitely being >> used shortly after and before init_before_ergo is called. > > I see that now and it is very unfortunate because I really do not like what you had to do here. As you can tell from the logic in create_vm we have always refactored to ensure we can progressively manage the interleaving of OS initialization with Arguments processing. So having a deep part of Argument processing go off and call some more OS initialization is not nice. That said I can't see a way around it without very unreasonable refactoring. > > But I do have a couple of changes I'd like to request please: > > 1. Move the call to os::initialize_container_support() up a level to before the call to finalize_vm_init_args(), with a more elaborate comment: > > // We need to ensure processor and memory resources have been properly > // configured - which may rely on arguments we just processed - before > // doing the final argument processing. Any argument processing that > // needs to know about processor and memory resources must occur after > // this point. > > os::initialize_container_support(); > > // Do final processing now that all arguments have been parsed > result = finalize_vm_init_args(patch_mod_javabase); > > 2. Simplify and modify os.hpp as follows: > > + LINUX_ONLY(static void pd_initialize_container_support();) > > public: > static void init(void); // Called before command line parsing > > + static void initialize_container_support() { // Called during command line parsing > + LINUX_ONLY(pd_initialize_container_support();) > + } > > static void init_before_ergo(void); // Called after command line parsing > // before VM ergonomics > > 3. In thread.cpp add a comment here: > > // Parse arguments > + // Note: this internally calls os::initialize_container_support() > jint parse_result = Arguments::parse(args); All very reasonable changes. Thanks, Bob. > > Thanks. > >>> >>>> 2. I no longer use the cpuset.cpus contents since sched_getaffinity reports the correct results >>>> even if someone manually updates the cgroup data. I originally didn?t think this was the case since >>>> sched_setaffinity didn?t automatically update the cpuset file contents but the inverse is true. >>> >>> Ok. >>> >>>> 3. I ifdef?d the container function support in src/hotspot/share/runtime/os.hpp to avoid putting stubs in all other os >>>> platform directories. I can do this if it?s absolutely necessary. >>> >>> You should not need to do this if initialization moves as I suggested above. os::init_before_ergo() in os_linux.cpp can call OSContainer::init(). >>> No need for os::initialize_container_support() or os::pd_initialize_container_support. >> But os::init_before_ergo is in shared code. > > Yep my bad - point is moot now anyway. > > > >>> src/hotspot/os/linux/os_linux.cpp/.hpp >>> >>> 187 log_trace(os)("available container memory: " JULONG_FORMAT, avail_mem); >>> 188 return avail_mem; >>> 189 } else { >>> 190 log_debug(os,container)("container memory usage call failed: " JLONG_FORMAT, mem_usage); >>> >>> Why "trace" (the third logging level) to show the information, but "debug" (the second level) to show failed calls? You use debug in other files for basic info. Overall I'm unclear on your use of debug versus trace for the logging. 
>> I use trace for noisy information that is not reporting errors and debug for failures that are informational and not fatal. >> In this case, the call could return -1 or -2. -1 is unlimited and -2 is an error. In either case we fallback to the >> standard system call to get available memory. I would have used warning but since these messages were occurring >> during a test run causing test failures. > > Okay. Thanks for clarifying. > >>> >>> --- >>> >>> src/hotspot/os/linux/osContainer_linux.cpp >>> >>> Dead code: >>> >>> 376 #if 0 >>> 377 os::Linux::print_container_info(tty); >>> ... >>> 390 #endif >> I left it in for standalone testing. Should I use some other #if? > > We don't generally leave in dead code in the runtime code. Do you see this as useful after you've finalized the changes? > > Is this testing just for showing the logging? Is it worth making this a logging controlled call? Is it suitable for a Gtest test? > > Thanks, > David > ----- > >> Bob. >>> >>> Thanks, >>> David >>> >>>> Bob. From coleen.phillimore at oracle.com Fri Oct 13 13:25:06 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 13 Oct 2017 09:25:06 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> Message-ID: <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> Hi Kim, Thank you for the detailed review and the time you've spent on it, and discussion yesterday. On 10/12/17 7:17 PM, Kim Barrett wrote: >> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >> >> Summary: With the new template functions these are unnecessary. >> >> The changes are mostly s/_ptr// and removing the cast to return type. There weren't many types that needed to be improved to match the template version of the function. Some notes: >> 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, rearranging arguments. >> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. I disliked the first name because it's not explicit from the callers that there's an underlying cas. If people want to fight, I'll remove the function and use cmpxchg because there are only a couple places where this is a little nicer. >> 3. Added Atomic::sub() >> >> Tested with JPRT, mach5 tier1-5 on linux,windows and solaris. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8188220 >> >> Thanks, >> Coleen > I looked harder at the potential ABA problems, and believe they are > okay. There can be multiple threads doing pushes, and there can be > multiple threads doing pops, but not both at the same time. > > ------------------------------------------------------------------------------ > src/hotspot/cpu/zero/cppInterpreter_zero.cpp > 279 if (Atomic::cmpxchg(monitor, lockee->mark_addr(), disp) != disp) { > > How does this work? monitor and disp seem like they have unrelated > types? Given that this is zero-specific code, maybe this hasn't been > tested? > > Similarly here: > 423 if (Atomic::cmpxchg(header, rcvr->mark_addr(), lock) != lock) { I haven't built zero.? I don't know how to do this anymore (help?) I fixed the obvious type mismatches here and in bytecodeInterpreter.cpp.? I'll try to build it. 
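As an aside on what "obvious type mismatches" means in practice: the templated cmpxchg deduces a single pointee type from all of its arguments, so a (void*) argument next to a markOop destination no longer slides through. A standalone sketch of the shape of the problem (simplified std::atomic-based stand-in, not the interpreter sources):

  #include <atomic>

  class markOopDesc;                 // opaque, as in HotSpot
  typedef markOopDesc* markOop;

  // Simplified stand-in for the strictly typed template: one T for everything.
  template <typename T>
  T cmpxchg(T exchange_value, std::atomic<T>* dest, T compare_value) {
    dest->compare_exchange_strong(compare_value, exchange_value);
    return compare_value;            // holds the old value either way
  }

  std::atomic<markOop> mark_word;    // stand-in for *obj->mark_addr()

  bool try_install(markOop new_header, markOop mark) {
    // return cmpxchg((void*)new_header, &mark_word, mark) == mark;
    //   -> no single T fits (void* vs markOop): the mismatch flagged in the review
    return cmpxchg(new_header, &mark_word, mark) == mark;   // deduces T = markOop
  }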
> > ------------------------------------------------------------------------------ > src/hotspot/share/asm/assembler.cpp > 239 dcon->value_fn = cfn; > > Is it actually safe to remove the atomic update? If multiple threads > performing the assignment *are* possible (and I don't understand the > context yet, so don't know the answer to that), then a bare non-atomic > assignment is a race, e.g. undefined behavior. > > Regardless of that, I think the CAST_FROM_FN_PTR should be retained. I can find no uses of this code, ie. looking for "delayed_value".? I think it was early jsr292 code.? I could also not find any combination of casts that would make it compile, so in the end I believed the comment and took out the cmpxchg.?? The code appears to be intended to for bootstrapping, see the call to update_delayed_values() in JavaClasses::compute_offsets(). The CAST_FROM_FN_PTR was to get it to compile with cmpxchg, the new code does not require a cast.? If you can help with finding the right set of casts, I'd be happy to put the cmpxchg back in.? I just couldn't find one. > > ------------------------------------------------------------------------------ > src/hotspot/share/classfile/classLoaderData.cpp > 167 Chunk* head = (Chunk*) OrderAccess::load_acquire(&_head); > > I think the cast to Chunk* is no longer needed. Missed another, thanks.? No that's the same one David found. > > ------------------------------------------------------------------------------ > src/hotspot/share/classfile/classLoaderData.cpp > 946 ClassLoaderData* old = Atomic::cmpxchg(cld, cld_addr, (ClassLoaderData*)NULL); > 947 if (old != NULL) { > 948 delete cld; > 949 // Returns the data. > 950 return old; > 951 } > > That could instead be > > if (!Atomic::replace_if_null(cld, cld_addr)) { > delete cld; // Lost the race. > return *cld_addr; // Use the winner's value. > } > > And apparently the caller of CLDG::add doesn't care whether the > returned CLD has actually been added to the graph yet. If that's not > true, then there's a bug here, since a race loser might return a > winner's value before the winner has actually done the insertion. True, the race loser doesn't care whether the CLD has been added to the graph. Your instead code requires a comment that replace_if_null is really a compare exchange and has an extra read of the original value, so I am leaving what I have which is clearer to me. > > ------------------------------------------------------------------------------ > src/hotspot/share/classfile/verifier.cpp > 71 static void* verify_byte_codes_fn() { > 72 if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == NULL) { > 73 void *lib_handle = os::native_java_library(); > 74 void *func = os::dll_lookup(lib_handle, "VerifyClassCodesForMajorVersion"); > 75 OrderAccess::release_store(&_verify_byte_codes_fn, func); > 76 if (func == NULL) { > 77 _is_new_verify_byte_codes_fn = false; > 78 func = os::dll_lookup(lib_handle, "VerifyClassCodes"); > 79 OrderAccess::release_store(&_verify_byte_codes_fn, func); > 80 } > 81 } > 82 return (void*)_verify_byte_codes_fn; > 83 } > > [pre-existing] > > I think this code has race problems; a caller could unexpectedly and > inappropriately return NULL. Consider the case where there is no > VerifyClassCodesForMajorVersion, but there is VerifyClassCodes. > > The variable is initially NULL. > > Both Thread1 and Thread2 reach line 73, having both seen a NULL value > for the variable. > > Thread1 reaches line 80, setting the variable to VerifyClassCodes. 
> > Thread2 reaches line 76, resetting the variable to NULL. > > Thread1 reads the now (momentarily) NULL value and returns it. > > I think the first release_store should be conditional on func != NULL. > Also, the usage of _is_new_verify_byte_codes_fn seems suspect. > And a minor additional nit: the cast in the return is unnecessary. Yes, this looks like a bug.?? I'll cut/paste this and file it.?? It may be that this is support for the old verifier in old jdk versions that can be cleaned up. > > ------------------------------------------------------------------------------ > src/hotspot/share/code/nmethod.cpp > 1664 nmethod* observed_mark_link = _oops_do_mark_link; > 1665 if (observed_mark_link == NULL) { > 1666 // Claim this nmethod for this thread to mark. > 1667 if (Atomic::cmpxchg_if_null(NMETHOD_SENTINEL, &_oops_do_mark_link)) { > > With these changes, the only use of observed_mark_link is in the if. > I'm not sure that variable is really useful anymore, e.g. just use > > if (_oops_do_mark_link == NULL) { Ok fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp > > In CMSCollector::par_take_from_overflow_list, if BUSY and prefix were > of type oopDesc*, I think there would be a whole lot fewer casts and > cast_to_oop's. Later on, I think suffix_head, observed_overflow_list, > and curr_overflow_list could also be oopDesc* instead of oop to > eliminate more casts. I actually tried to make this change but ran into more fan out that way, so went back and just fixed the cmpxchg calls to cast oops to oopDesc* and things were less perturbed that way. > > And some similar changes in CMSCollector::par_push_on_overflow_list. > > And similarly in parNewGeneration.cpp, in push_on_overflow_list and > take_from_overflow_list_work. > > As noted in the comments for JDK-8165857, the lists and "objects" > involved here aren't really oops, but rather the shattered remains of Yes, somewhat horrified at the value of BUSY. > oops. The suggestion there was to use HeapWord* and carry through the > fanout; what was actually done was to change _overflow_list to > oopDesc* to minimize fanout, even though that's kind of lying to the > type system. Now, with the cleanup of cmpxchg_ptr and such, we're > paying the price of doing the minimal thing back then. I will file an RFE about cleaning this up.? I think what I've done was the minimal thing. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp > 7960 Atomic::add(-n, &_num_par_pushes); > > Atomic::sub fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/cms/parNewGeneration.cpp > 1455 Atomic::add(-n, &_num_par_pushes); fixed. > Atomic::sub > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/dirtyCardQueue.cpp > 283 void* actual = Atomic::cmpxchg(next, &_cur_par_buffer_node, nd); > ... > 289 nd = static_cast(actual); > > Change actual's type to BufferNode* and remove the cast on line 289. fixed.? missed that one. gross. 
> > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1CollectedHeap.cpp > > [pre-existing] > 3499 old = (CompiledMethod*)_postponed_list; > > I think that cast is only needed because > G1CodeCacheUnloadingTask::_postponed_list is incorrectly typed as > "volatile CompiledMethod*", when I think it ought to be > "CompiledMethod* volatile". > > I think G1CodeCacheUnloading::_claimed_nmethod is similarly mis-typed, > with a similar should not be needed cast: > 3530 first = (CompiledMethod*)_claimed_nmethod; > > and another for _postponed_list here: > 3552 claim = (CompiledMethod*)_postponed_list; I've fixed this.?? C++ is so confusing about where to put the volatile.?? Everyone has been tripped up by it. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1HotCardCache.cpp > 77 jbyte* previous_ptr = (jbyte*)Atomic::cmpxchg(card_ptr, > > I think the cast of the cmpxchg result is no longer needed. fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp > 254 char* touch_addr = (char*)Atomic::add(actual_chunk_size, &_cur_addr) - actual_chunk_size; > > I think the cast of the add result is no longer needed. got it already. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1StringDedup.cpp > 213 return (size_t)Atomic::add(partition_size, &_next_bucket) - partition_size; > > I think the cast of the add result is no longer needed. I was slacking in the g1 files.? fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/heapRegionRemSet.cpp > 200 PerRegionTable* res = > 201 Atomic::cmpxchg(nxt, &_free_list, fl); > > Please remove the line break, now that the code has been simplified. > > But wait, doesn't this alloc exhibit classic ABA problems? I *think* > this works because alloc and bulk_free are called in different phases, > never overlapping. I don't know.? Do you want to file a bug to investigate this? fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/sparsePRT.cpp > 295 SparsePRT* res = > 296 Atomic::cmpxchg(sprt, &_head_expanded_list, hd); > and > 307 SparsePRT* res = > 308 Atomic::cmpxchg(next, &_head_expanded_list, hd); > > I'd rather not have the line breaks in these either. > > And get_from_expanded_list also appears to have classic ABA problems. > I *think* this works because add_to_expanded_list and > get_from_expanded_list are called in different phases, never > overlapping. Fixed, same question as above?? Or one bug to investigate both? > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/taskqueue.inline.hpp > 262 return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, > 263 (volatile intptr_t *)&_data, > 264 (intptr_t)old_age._data); > > This should be > > return Atomic::cmpxchg(new_age._data, &_data, old_age._data); fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/interpreter/bytecodeInterpreter.cpp > This doesn't have any casts, which I think is correct. > 708 if (Atomic::cmpxchg(header, rcvr->mark_addr(), mark) == mark) { > > but these do. 
> 718 if (Atomic::cmpxchg((void*)new_header, rcvr->mark_addr(), mark) == mark) { > 737 if (Atomic::cmpxchg((void*)new_header, rcvr->mark_addr(), header) == header) { > > I'm not sure how the ones with casts even compile? mark_addr() seems > to be a markOop*, which is a markOopDesc**, where markOopDesc is a > class. void* is not implicitly convertible to markOopDesc*. > > Hm, this entire file is #ifdef CC_INTERP. Is this zero-only code? Or > something like that? > > Similarly here: > 906 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) { > and > 917 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) { > 935 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) { > > and here: > 1847 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) { > 1858 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) { > 1878 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) { > > and here: > 1847 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) { > 1858 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) { > 1878 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) { I've changed all these.?? This is part of Zero. > > ------------------------------------------------------------------------------ > src/hotspot/share/memory/metaspace.cpp > 1502 size_t value = OrderAccess::load_acquire(&_capacity_until_GC); > ... > 1537 return (size_t)Atomic::sub((intptr_t)v, &_capacity_until_GC); > > These and other uses of _capacity_until_GC suggest that variable's > type should be size_t rather than intptr_t. Note that I haven't done > a careful check of uses to see if there are any places where such a > change would cause problems. Yes, I had a hard time with metaspace.cpp because I agree _capacity_until_GC should be size_t.?? Tried to make this change and it cascaded a bit.? I'll file an RFE to change this type separately. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/constantPool.cpp > 229 OrderAccess::release_store((Klass* volatile *)adr, k); > 246 OrderAccess::release_store((Klass* volatile *)adr, k); > 514 OrderAccess::release_store((Klass* volatile *)adr, k); > > Casts are not needed. fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/constantPool.hpp > 148 volatile intptr_t adr = OrderAccess::load_acquire(obj_at_addr_raw(which)); > > [pre-existing] > Why is adr declared volatile? golly beats me.? concurrency is scary, especially in the constant pool. The load_acquire() should make sure the value is fetched from memory so volatile is unneeded. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/cpCache.cpp > 157 intx newflags = (value & parameter_size_mask); > 158 Atomic::cmpxchg(newflags, &_flags, (intx)0); > > This is a nice demonstration of why I wanted to include some value > preserving integral conversions in cmpxchg, rather than requiring > exact type matching in the integral case. There have been some others > that I haven't commented on. Apparently we (I) got away with > including such conversions in Atomic::add, which I'd forgotten about. > And see comment regarding Atomic::sub below. 
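To see the exact-matching point for integrals (the reason the (intx)0 cast is needed at all) next to an add-style signature that tolerates value-preserving conversions, here is a standalone sketch with simplified stand-ins, not the real Atomic machinery:

  #include <atomic>
  #include <cstdint>

  typedef intptr_t intx;                       // HotSpot-style alias

  // cmpxchg stand-in: a single T deduced from all three arguments.
  template <typename T>
  T cmpxchg(T exchange_value, std::atomic<T>* dest, T compare_value) {
    dest->compare_exchange_strong(compare_value, exchange_value);
    return compare_value;
  }

  // add stand-in: separate template parameters, so a narrower integral addend
  // is simply converted, with no cast at the call site.
  template <typename I, typename D>
  D add(I add_value, std::atomic<D>* dest) {
    return dest->fetch_add(static_cast<D>(add_value)) + static_cast<D>(add_value);
  }

  std::atomic<intx> _flags;

  void set_parameter_size_flags(intx newflags) {
    // cmpxchg(newflags, &_flags, 0);          // on LP64: T is ambiguous (intx vs int)
    cmpxchg(newflags, &_flags, (intx)0);       // exact match, hence the cast
    add(1, &_flags);                           // int addend converts to intx, no cast
  }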
> > ------------------------------------------------------------------------------ > src/hotspot/share/oops/cpCache.hpp > 139 volatile Metadata* _f1; // entry specific metadata field > > [pre-existing] > I suspect the type should be Metadata* volatile. And that would > eliminate the need for the cast here: > > 339 Metadata* f1_ord() const { return (Metadata *)OrderAccess::load_acquire(&_f1); } > > I don't know if there are any other changes needed or desirable around > _f1 usage. yes, fixed this. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/method.hpp > 139 volatile address from_compiled_entry() const { return OrderAccess::load_acquire(&_from_compiled_entry); } > 140 volatile address from_compiled_entry_no_trampoline() const; > 141 volatile address from_interpreted_entry() const{ return OrderAccess::load_acquire(&_from_interpreted_entry); } > > [pre-existing] > The volatile qualifiers here seem suspect to me. Again much suspicion about concurrency and giant pain, which I remember, of debugging these when they were broken. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/oop.inline.hpp > 391 narrowOop old = (narrowOop)Atomic::xchg(val, (narrowOop*)dest); > > Cast of return type is not needed. fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/prims/jni.cpp > > [pre-existing] > > copy_jni_function_table should be using Copy::disjoint_words_atomic. yuck. > > ------------------------------------------------------------------------------ > src/hotspot/share/prims/jni.cpp > > [pre-existing] > > 3892 // We're about to use Atomic::xchg for synchronization. Some Zero > 3893 // platforms use the GCC builtin __sync_lock_test_and_set for this, > 3894 // but __sync_lock_test_and_set is not guaranteed to do what we want > 3895 // on all architectures. So we check it works before relying on it. > 3896 #if defined(ZERO) && defined(ASSERT) > 3897 { > 3898 jint a = 0xcafebabe; > 3899 jint b = Atomic::xchg(0xdeadbeef, &a); > 3900 void *c = &a; > 3901 void *d = Atomic::xchg(&b, &c); > 3902 assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, "Atomic::xchg() works"); > 3903 assert(c == &b && d == &a, "Atomic::xchg() works"); > 3904 } > 3905 #endif // ZERO && ASSERT > > It seems rather strange to be testing Atomic::xchg() here, rather than > as part of unit testing Atomic? Fail unit testing => don't try to > use... This is zero.? I'm not touching this. > > ------------------------------------------------------------------------------ > src/hotspot/share/prims/jvmtiRawMonitor.cpp > 130 if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { > 142 if (_owner == NULL && Atomic::cmpxchg_if_null((void*)Self, &_owner)) { > > I think these casts aren't needed. _owner is void*, and Self is > Thread*, which is implicitly convertible to void*. > > Similarly here, for the THREAD argument: > 280 Contended = Atomic::cmpxchg((void*)THREAD, &_owner, (void*)NULL); > 283 Contended = Atomic::cmpxchg((void*)THREAD, &_owner, (void*)NULL); Okay, let me see if the compiler(s) eat that. (yes they do) > > ------------------------------------------------------------------------------ > src/hotspot/share/prims/jvmtiRawMonitor.hpp > > This file is in the webrev, but seems to be unchanged. It'll be cleaned up with the the commit and not be part of the changeset. 
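The converse, pointer case from the jvmtiRawMonitor/objectMonitor comments: with the new value and the destination as separate template parameters, an implicit Thread* to void* conversion is enough, which is why the (void*)Self and (void*)THREAD casts can simply go while the (void*)NULL compare value stays. A standalone sketch with stand-in types, not the monitor sources:

  #include <atomic>
  #include <cstddef>

  class Thread {};                              // stand-in

  std::atomic<void*> _owner(nullptr);           // stand-in for a void* _owner field

  // cmpxchg shape with independent parameters for new value and destination.
  template <typename T, typename D>
  D cmpxchg(T exchange_value, std::atomic<D>* dest, D compare_value) {
    dest->compare_exchange_strong(compare_value, exchange_value);   // T converts to D
    return compare_value;                                           // old value either way
  }

  bool raw_enter(Thread* self) {
    // No (void*)self cast needed: Thread* converts implicitly to void*.
    return cmpxchg(self, &_owner, (void*)NULL) == NULL;
  }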
> > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/atomic.hpp > 520 template > 521 inline D Atomic::sub(I sub_value, D volatile* dest) { > 522 STATIC_ASSERT(IsPointer::value || IsIntegral::value); > 523 // Assumes two's complement integer representation. > 524 #pragma warning(suppress: 4146) > 525 return Atomic::add(-sub_value, dest); > 526 } > > I'm pretty sure this implementation is incorrect. I think it produces > the wrong result when I and D are both unsigned integer types and > sizeof(I) < sizeof(D). Can you suggest a correction?? I just copied Atomic::dec(). > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/mutex.cpp > 304 intptr_t v = Atomic::cmpxchg((intptr_t)_LBIT, &_LockWord.FullWord, (intptr_t)0); // agro ... > > _LBIT should probably be intptr_t, rather than an enum. Note that the > enum type is unused. The old value here is another place where an > implicit widening of same signedness would have been nice. (Such > implicit widening doesn't work for enums, since it's unspecified > whether they default to signed or unsigned representation, and > implementatinos differ.) This would be a good/simple cleanup.? I changed it to const intptr_t _LBIT = 1; > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/mutex.hpp > > [pre-existing] > > I think the Address member of the SplitWord union is unused. Looking > at AcquireOrPush (and others), I'm wondering whether it *should* be > used there, or whether just using intptr_t casts and doing integral > arithmetic (as is presently being done) is easier and clearer. > > Also the _LSBINDEX macro probably ought to be defined in mutex.cpp > rather than polluting the global namespace. And technically, that > name is reserved word. I moved both this and _LBIT into the top of mutex.cpp since they are used there. Cant define const intptr_t _LBIT =1; in a class in our version of C++. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/objectMonitor.cpp > 252 void * cur = Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL); > 409 if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { > 1983 ox = (Thread*)Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL); > > I think the casts of Self aren't needed. fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/objectMonitor.cpp > 995 if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { > 1020 if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { > > I think the casts of THREAD aren't needed. nope, fixed. > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/objectMonitor.hpp > 254 markOopDesc* volatile* header_addr(); > > Why isn't this volatile markOop* ? fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/synchronizer.cpp > 242 Atomic::cmpxchg_if_null((void*)Self, &(m->_owner))) { > > I think the cast of Self isn't needed. fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/synchronizer.cpp > 992 for (; block != NULL; block = (PaddedEnd *)next(block)) { > 1734 for (; block != NULL; block = (PaddedEnd *)next(block)) { > > [pre-existing] > All calls to next() pass a PaddedEnd* and cast the > result. 
How about moving all that behavior into next(). I fixed this next() function, but it necessitated a cast to FreeNext field.? The PaddedEnd<> type was intentionally not propagated to all the things that use it.?? Which is a shame because there are a lot more casts to PaddedEnd that could have been removed. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/synchronizer.cpp > 1970 if (monitor > (ObjectMonitor *)&block[0] && > 1971 monitor < (ObjectMonitor *)&block[_BLOCKSIZE]) { > > [pre-existing] > Are the casts needed here? I think PaddedEnd is > derived from ObjectMonitor, so implicit conversions should apply. prob not.? removed them. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/synchronizer.hpp > 28 #include "memory/padded.hpp" > 163 static PaddedEnd * volatile gBlockList; > > I was going to suggest as an alternative just making gBlockList a file > scoped variable in synchronizer.cpp, since it isn't used outside of > that file. Except that it is referenced by vmStructs. Curses! It's also used by the SA. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/thread.cpp > 4707 intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, (intptr_t)0); > > This and other places suggest LOCKBIT should be defined as intptr_t, > rather than as an enum value. The MuxBits enum type is unused. > > And the cast of 0 is another case where implicit widening would be nice. Making LOCKBIT a const intptr_t = 1 removes a lot of casts. > > ------------------------------------------------------------------------------ > src/hotspot/share/services/mallocSiteTable.cpp > 261 bool MallocSiteHashtableEntry::atomic_insert(const MallocSiteHashtableEntry* entry) { > 262 return Atomic::cmpxchg_if_null(entry, (const MallocSiteHashtableEntry**)&_next); > 263 } > > I think the problem here that is leading to the cast is that > atomic_insert is taking a const T*. Note that it's only caller passes > a non-const T*. I'll change the type to non-const.? We try to use consts... Thanks for the detailed review!? The gcc compiler seems happy so far, I'll post a webrev of the result of these changes after fixing Atomic::sub() and seeing how the other compilers deal with these changes. Thanks, Coleen > > ------------------------------------------------------------------------------ > From david.holmes at oracle.com Fri Oct 13 13:34:03 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 13 Oct 2017 23:34:03 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> <799205ae-ba9f-ce3a-8dd6-1a55e32689df@oracle.com> <9956F9D0-B01B-44FE-AE56-527907816436@oracle.com> <20ef0bac-1942-b29f-a9e2-4ea4d4f81cd2@oracle.com> Message-ID: <5d217c60-3049-30a6-c207-d6c9274a5ddf@oracle.com> Reading back through my suggestion for os.hpp initialize_container_support should just be init_container_support. 
Thanks, David On 13/10/2017 11:14 PM, Bob Vandette wrote: > >> On Oct 12, 2017, at 11:08 PM, David Holmes wrote: >> >> Hi Bob, >> >> On 13/10/2017 1:43 AM, Bob Vandette wrote: >>>> On Oct 11, 2017, at 9:04 PM, David Holmes wrote: >>>> >>>> Hi Bob, >>>> >>>> On 12/10/2017 5:11 AM, Bob Vandette wrote: >>>>> Here?s an updated webrev for this RFE that contains changes and cleanups based on feedback I?ve received so far. >>>>> I?m still investigating the best approach for reacting to cpu shares and quotas. I do not believe doing nothing is the answer. >>>> >>>> I do. :) Let me try this again. When you run outside of a container you don't get 100% of the CPUs - you have to share with whatever else is running on the system. You get a fraction of CPU time based on the load. We don't try to communicate load information to the VM/application so it can adapt. Within a container setting shares/quotas is just a way of setting an artificial load. So why should we be treating it any differently? >>> Because today we optimize for a lightly loaded system and when running serverless applications in containers we should be >>> optimizing for a fully loaded system. If developers don?t want this, then don?t use shares or quotas and you?ll have exactly >>> the behavior you have today. I think we just have to document the new behavior (and how to turn it off) so people know what >>> to expect. >> >> The person deploying the app may not have control over how the app is deployed in terms of shares/quotas. It all depends how (and who) manages the containers. This is a big part of my problem/concerns here that I don't know exactly how all this is organized and who knows what in advance and what they can control. >> >> But I'll let this drop, other than raising an additional concern. I don't think just allowing the user to hardwire the number of processors to use will necessarily solve the problem with what available_processors() returns. I'm concerned the execution of the VM may occur in a context where the number of processors is not known in advance, and the user can not disable shares/quotas. In that case we may need to have a flag that says to ignore shares/quotas in the processor count calculation. > > I?m not sure that?s a high probability issue. It?s my understanding that whoever is configuring the container > management will be specifying the resources required to run these applications which comes along with a > guarantee of these resources. If this issue does come up, I do have the -XX:-UseContainerSupport big > switch that turns all of this off. It will however disable the memory support as well. > >> >>> You seem to discount the added cost of 100s of VMs creating lots of un-necessaary threads. In the current JDK 10 code base, >>> In a heavily loaded system with 88 processors, VmData grows from 60MBs (1 cpu) to 376MB (88 cpus). This is only mapped >>> memory and it depends heavily on how deep in the stack these threads go before it impacts VmRSS but it shows the potential downside >>> of having 100s of VMs thinking they each own the entire machine. >> >> I agree that the default ergonomics does not scale well. Anyone doing any serious Java deployment tunes the VM explicitly and does not rely on the defaults. How will they do that in a container environment? I don't know. >> >> I would love to see some actual deployment scenarios/experiences for this to understand things better. > > This is one of the reasons I want to get this support out in JDK 10, to get some feedback under real scenarios. 
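For readers following the shares/quota discussion, a minimal self-contained sketch of the kind of calculation being debated: take the host CPU count and clamp it by quota/period and by shares/1024. The function name and signature are invented for illustration, the 1024-shares-per-one-CPU convention is the usual cgroup default, and none of this is quoted from the webrev; MIN2/MAX2 are the standard HotSpot macros.

static int container_aware_cpu_count(int host_cpus,
                                     jlong quota, jlong period, jlong shares) {
  int limit = host_cpus;
  if (quota > 0 && period > 0) {
    // Round up: e.g. quota 150ms with period 100ms counts as 2 CPUs.
    int quota_cpus = (int)((quota + period - 1) / period);
    limit = MIN2(limit, quota_cpus);
  }
  if (shares > 0) {
    // cgroup cpu.shares treats 1024 as one CPU's worth of weight.
    int share_cpus = (int)(shares / 1024);
    limit = MIN2(limit, MAX2(share_cpus, 1));
  }
  return MAX2(limit, 1);
}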
> >> >>> I haven?t even done any experiments to determine the added context switching cost if the VM decides to use excessive >>> pthreads. >>>> >>>> That's not to say an API to provide load/shares/quota information may not be useful, but that is a separate issue to what the "active processor count" should report. >>> I don?t have a problem with active processor count reporting the number of processors we have, but I do have a problem >>> with our current usage of this information within the VM and Core libraries. >> >> That is a somewhat separate issue. One worth pursuing separately. > > We should look at this as part of the ?Container aware Java? JEP. > >> >>>> >>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.01 >>>>> Updates: >>>>> 1. I had to move the processing of AggressiveHeap since the container memory size needs to be known before this can be processed. >>>> >>>> I don't like the placement of this - we don't call os:: init functions from inside Arguments - we manage the initialization sequence from Threads::create_vm. Seems to me that container initialization can/should happen in os::init_before_ergo, and the AggressiveHeap processing can occur at the start of Arguments::apply_ergo(). >>>> >>>> That said we need to be sure nothing touched by set_aggressive_heap_flags will be used before we now reach that code - there are a lot of flags being set in there. >>> This is exactly the reason why I put the call where it did. I put the call to set_aggressive_heap_flags in finalize_vm_init_args >>> because that is exactly what this call is doing. It?s finalizing flags used after the parsing. The impacted flags are definitely being >>> used shortly after and before init_before_ergo is called. >> >> I see that now and it is very unfortunate because I really do not like what you had to do here. As you can tell from the logic in create_vm we have always refactored to ensure we can progressively manage the interleaving of OS initialization with Arguments processing. So having a deep part of Argument processing go off and call some more OS initialization is not nice. That said I can't see a way around it without very unreasonable refactoring. >> >> But I do have a couple of changes I'd like to request please: >> >> 1. Move the call to os::initialize_container_support() up a level to before the call to finalize_vm_init_args(), with a more elaborate comment: >> >> // We need to ensure processor and memory resources have been properly >> // configured - which may rely on arguments we just processed - before >> // doing the final argument processing. Any argument processing that >> // needs to know about processor and memory resources must occur after >> // this point. >> >> os::initialize_container_support(); >> >> // Do final processing now that all arguments have been parsed >> result = finalize_vm_init_args(patch_mod_javabase); >> >> 2. Simplify and modify os.hpp as follows: >> >> + LINUX_ONLY(static void pd_initialize_container_support();) >> >> public: >> static void init(void); // Called before command line parsing >> >> + static void initialize_container_support() { // Called during command line parsing >> + LINUX_ONLY(pd_initialize_container_support();) >> + } >> >> static void init_before_ergo(void); // Called after command line parsing >> // before VM ergonomics >> >> 3. In thread.cpp add a comment here: >> >> // Parse arguments >> + // Note: this internally calls os::initialize_container_support() >> jint parse_result = Arguments::parse(args); > > All very reasonable changes. 
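For context, the enclosing sequence in Threads::create_vm() that these three pieces slot into is roughly the sketch below; this is a simplified illustration of the intended ordering, not a quote of the actual thread.cpp diff:

  // Parse arguments
  // Note: this internally calls os::initialize_container_support()
  jint parse_result = Arguments::parse(args);
  if (parse_result != JNI_OK) return parse_result;

  os::init_before_ergo();                         // OS state that ergonomics depends on

  jint ergo_result = Arguments::apply_ergo();     // VM ergonomics runs with container
  if (ergo_result != JNI_OK) return ergo_result;  // limits already known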
> > Thanks, > Bob. > >> >> Thanks. >> >>>> >>>>> 2. I no longer use the cpuset.cpus contents since sched_getaffinity reports the correct results >>>>> even if someone manually updates the cgroup data. I originally didn?t think this was the case since >>>>> sched_setaffinity didn?t automatically update the cpuset file contents but the inverse is true. >>>> >>>> Ok. >>>> >>>>> 3. I ifdef?d the container function support in src/hotspot/share/runtime/os.hpp to avoid putting stubs in all other os >>>>> platform directories. I can do this if it?s absolutely necessary. >>>> >>>> You should not need to do this if initialization moves as I suggested above. os::init_before_ergo() in os_linux.cpp can call OSContainer::init(). >>>> No need for os::initialize_container_support() or os::pd_initialize_container_support. >>> But os::init_before_ergo is in shared code. >> >> Yep my bad - point is moot now anyway. >> >> >> >>>> src/hotspot/os/linux/os_linux.cpp/.hpp >>>> >>>> 187 log_trace(os)("available container memory: " JULONG_FORMAT, avail_mem); >>>> 188 return avail_mem; >>>> 189 } else { >>>> 190 log_debug(os,container)("container memory usage call failed: " JLONG_FORMAT, mem_usage); >>>> >>>> Why "trace" (the third logging level) to show the information, but "debug" (the second level) to show failed calls? You use debug in other files for basic info. Overall I'm unclear on your use of debug versus trace for the logging. >>> I use trace for noisy information that is not reporting errors and debug for failures that are informational and not fatal. >>> In this case, the call could return -1 or -2. -1 is unlimited and -2 is an error. In either case we fallback to the >>> standard system call to get available memory. I would have used warning but since these messages were occurring >>> during a test run causing test failures. >> >> Okay. Thanks for clarifying. >> >>>> >>>> --- >>>> >>>> src/hotspot/os/linux/osContainer_linux.cpp >>>> >>>> Dead code: >>>> >>>> 376 #if 0 >>>> 377 os::Linux::print_container_info(tty); >>>> ... >>>> 390 #endif >>> I left it in for standalone testing. Should I use some other #if? >> >> We don't generally leave in dead code in the runtime code. Do you see this as useful after you've finalized the changes? >> >> Is this testing just for showing the logging? Is it worth making this a logging controlled call? Is it suitable for a Gtest test? >> >> Thanks, >> David >> ----- >> >>> Bob. >>>> >>>> Thanks, >>>> David >>>> >>>>> Bob. > From coleen.phillimore at oracle.com Fri Oct 13 14:09:42 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 13 Oct 2017 10:09:42 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <2e8d66a6-24c3-b4de-e187-47a9e582238c@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <2e8d66a6-24c3-b4de-e187-47a9e582238c@oracle.com> Message-ID: <756c8ab7-a63b-26e7-fbb9-79bc261068cd@oracle.com> On 10/12/17 8:55 PM, David Holmes wrote: > Hi Kim, > > Very detailed analysis! A few things have already been updated by Coleen. > > Many of the issues with possibly incorrect/inappropriate types really > need to be dealt with separately - they go beyond the basic renaming - > by their component teams. Yes, I fixed up some types that were trivial changes, but agree with you and left other types to be dealt with by the component teams. I filed some RFEs and bugs. 
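Several of the type fixes in this review come down to where volatile sits in a pointer declaration. A toy illustration with made-up names, not code from the patch:

struct ToyList {                    // invented type, purely illustrative
  ToyList* volatile _next;          // volatile *pointer*: the pointer itself is what gets
                                    // loaded/stored/CASed, which is the form the
                                    // Atomic/OrderAccess templates accept without casts
  // volatile ToyList* _next2;      // pointer to volatile data: forces a cast back to
                                    // "ToyList* volatile*" at every call site
};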
thanks, Coleen > > Similarly any ABA issues - which are likely non-issues but not clearly > documented - should be handled separately. And the potential race you > highlight below - though to be honest I couldn't match your statements > with the code as shown. > > Thanks, > David > > On 13/10/2017 9:17 AM, Kim Barrett wrote: >>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>> >>> Summary: With the new template functions these are unnecessary. >>> >>> The changes are mostly s/_ptr// and removing the cast to return >>> type.? There weren't many types that needed to be improved to match >>> the template version of the function.?? Some notes: >>> 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, rearranging >>> arguments. >>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.? I >>> disliked the first name because it's not explicit from the callers >>> that there's an underlying cas.? If people want to fight, I'll >>> remove the function and use cmpxchg because there are only a couple >>> places where this is a little nicer. >>> 3. Added Atomic::sub() >>> >>> Tested with JPRT, mach5 tier1-5 on linux,windows and solaris. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8188220 >>> >>> Thanks, >>> Coleen >> >> I looked harder at the potential ABA problems, and believe they are >> okay.? There can be multiple threads doing pushes, and there can be >> multiple threads doing pops, but not both at the same time. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/cpu/zero/cppInterpreter_zero.cpp >> ? 279???? if (Atomic::cmpxchg(monitor, lockee->mark_addr(), disp) != >> disp) { >> >> How does this work?? monitor and disp seem like they have unrelated >> types?? Given that this is zero-specific code, maybe this hasn't been >> tested? >> >> Similarly here: >> ? 423?????? if (Atomic::cmpxchg(header, rcvr->mark_addr(), lock) != >> lock) { >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/asm/assembler.cpp >> ? 239???????? dcon->value_fn = cfn; >> >> Is it actually safe to remove the atomic update?? If multiple threads >> performing the assignment *are* possible (and I don't understand the >> context yet, so don't know the answer to that), then a bare non-atomic >> assignment is a race, e.g. undefined behavior. >> >> Regardless of that, I think the CAST_FROM_FN_PTR should be retained. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/classfile/classLoaderData.cpp >> ? 167?? Chunk* head = (Chunk*) OrderAccess::load_acquire(&_head); >> >> I think the cast to Chunk* is no longer needed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/classfile/classLoaderData.cpp >> ? 946???? ClassLoaderData* old = Atomic::cmpxchg(cld, cld_addr, >> (ClassLoaderData*)NULL); >> ? 947???? if (old != NULL) { >> ? 948?????? delete cld; >> ? 949?????? // Returns the data. >> ? 950?????? return old; >> ? 951???? } >> >> That could instead be >> >> ?? if (!Atomic::replace_if_null(cld, cld_addr)) { >> ???? delete cld;?????????? // Lost the race. >> ???? return *cld_addr;???? // Use the winner's value. >> ?? } >> >> And apparently the caller of CLDG::add doesn't care whether the >> returned CLD has actually been added to the graph yet.? 
If that's not >> true, then there's a bug here, since a race loser might return a >> winner's value before the winner has actually done the insertion. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/classfile/verifier.cpp >> ?? 71 static void* verify_byte_codes_fn() { >> ?? 72?? if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == NULL) { >> ?? 73???? void *lib_handle = os::native_java_library(); >> ?? 74???? void *func = os::dll_lookup(lib_handle, >> "VerifyClassCodesForMajorVersion"); >> ?? 75???? OrderAccess::release_store(&_verify_byte_codes_fn, func); >> ?? 76???? if (func == NULL) { >> ?? 77?????? _is_new_verify_byte_codes_fn = false; >> ?? 78?????? func = os::dll_lookup(lib_handle, "VerifyClassCodes"); >> ?? 79 OrderAccess::release_store(&_verify_byte_codes_fn, func); >> ?? 80???? } >> ?? 81?? } >> ?? 82?? return (void*)_verify_byte_codes_fn; >> ?? 83 } >> >> [pre-existing] >> >> I think this code has race problems; a caller could unexpectedly and >> inappropriately return NULL.? Consider the case where there is no >> VerifyClassCodesForMajorVersion, but there is VerifyClassCodes. >> >> The variable is initially NULL. >> >> Both Thread1 and Thread2 reach line 73, having both seen a NULL value >> for the variable. >> >> Thread1 reaches line 80, setting the variable to VerifyClassCodes. >> >> Thread2 reaches line 76, resetting the variable to NULL. >> >> Thread1 reads the now (momentarily) NULL value and returns it. >> >> I think the first release_store should be conditional on func != NULL. >> Also, the usage of _is_new_verify_byte_codes_fn seems suspect. >> And a minor additional nit: the cast in the return is unnecessary. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/code/nmethod.cpp >> 1664?? nmethod* observed_mark_link = _oops_do_mark_link; >> 1665?? if (observed_mark_link == NULL) { >> 1666???? // Claim this nmethod for this thread to mark. >> 1667???? if (Atomic::cmpxchg_if_null(NMETHOD_SENTINEL, >> &_oops_do_mark_link)) { >> >> With these changes, the only use of observed_mark_link is in the if. >> I'm not sure that variable is really useful anymore, e.g. just use >> >> ?? if (_oops_do_mark_link == NULL) { >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >> >> In CMSCollector::par_take_from_overflow_list, if BUSY and prefix were >> of type oopDesc*, I think there would be a whole lot fewer casts and >> cast_to_oop's.? Later on, I think suffix_head, observed_overflow_list, >> and curr_overflow_list could also be oopDesc* instead of oop to >> eliminate more casts. >> >> And some similar changes in CMSCollector::par_push_on_overflow_list. >> >> And similarly in parNewGeneration.cpp, in push_on_overflow_list and >> take_from_overflow_list_work. >> >> As noted in the comments for JDK-8165857, the lists and "objects" >> involved here aren't really oops, but rather the shattered remains of >> oops.? The suggestion there was to use HeapWord* and carry through the >> fanout; what was actually done was to change _overflow_list to >> oopDesc* to minimize fanout, even though that's kind of lying to the >> type system.? Now, with the cleanup of cmpxchg_ptr and such, we're >> paying the price of doing the minimal thing back then. 
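For reference, a minimal self-contained sketch (not HotSpot code) of the push pattern these overflow lists use, written against the template cmpxchg(new_value, dest, compare_value) argument order used throughout this thread. It carries the same ABA caveat discussed elsewhere in this review unless pushers and poppers are confined to separate phases:

struct Node { Node* _next; };            // stand-in for the real list element type
static Node* volatile _list_head = NULL;

static void push(Node* n) {
  Node* old;
  do {
    old = _list_head;
    n->_next = old;
    // The template deduces Node* from &_list_head, so no casts are needed as
    // long as n and old have exactly that pointer type.
  } while (Atomic::cmpxchg(n, &_list_head, old) != old);
}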
>> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >> 7960?? Atomic::add(-n, &_num_par_pushes); >> >> Atomic::sub >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/cms/parNewGeneration.cpp >> 1455?? Atomic::add(-n, &_num_par_pushes); >> >> Atomic::sub >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/dirtyCardQueue.cpp >> ? 283???? void* actual = Atomic::cmpxchg(next, &_cur_par_buffer_node, >> nd); >> ... >> ? 289?????? nd = static_cast(actual); >> >> Change actual's type to BufferNode* and remove the cast on line 289. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/g1CollectedHeap.cpp >> >> [pre-existing] >> 3499???????? old = (CompiledMethod*)_postponed_list; >> >> I think that cast is only needed because >> G1CodeCacheUnloadingTask::_postponed_list is incorrectly typed as >> "volatile CompiledMethod*", when I think it ought to be >> "CompiledMethod* volatile". >> >> I think G1CodeCacheUnloading::_claimed_nmethod is similarly mis-typed, >> with a similar should not be needed cast: >> 3530?????? first = (CompiledMethod*)_claimed_nmethod; >> >> and another for _postponed_list here: >> 3552?????? claim = (CompiledMethod*)_postponed_list; >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/g1HotCardCache.cpp >> ?? 77?? jbyte* previous_ptr = (jbyte*)Atomic::cmpxchg(card_ptr, >> >> I think the cast of the cmpxchg result is no longer needed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp >> ? 254?????? char* touch_addr = (char*)Atomic::add(actual_chunk_size, >> &_cur_addr) - actual_chunk_size; >> >> I think the cast of the add result is no longer needed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/g1StringDedup.cpp >> ? 213?? return (size_t)Atomic::add(partition_size, &_next_bucket) - >> partition_size; >> >> I think the cast of the add result is no longer needed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >> ? 200?????? PerRegionTable* res = >> ? 201???????? Atomic::cmpxchg(nxt, &_free_list, fl); >> >> Please remove the line break, now that the code has been simplified. >> >> But wait, doesn't this alloc exhibit classic ABA problems?? I *think* >> this works because alloc and bulk_free are called in different phases, >> never overlapping. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/sparsePRT.cpp >> ? 295???? SparsePRT* res = >> ? 296?????? Atomic::cmpxchg(sprt, &_head_expanded_list, hd); >> and >> ? 307???? SparsePRT* res = >> ? 308?????? Atomic::cmpxchg(next, &_head_expanded_list, hd); >> >> I'd rather not have the line breaks in these either. >> >> And get_from_expanded_list also appears to have classic ABA problems. >> I *think* this works because add_to_expanded_list and >> get_from_expanded_list are called in different phases, never >> overlapping. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/shared/taskqueue.inline.hpp >> ? 262?? 
return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >> ? 263?????????????????????????????????? (volatile intptr_t *)&_data, >> ? 264 (intptr_t)old_age._data); >> >> This should be >> >> ?? return Atomic::cmpxchg(new_age._data, &_data, old_age._data); >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/interpreter/bytecodeInterpreter.cpp >> This doesn't have any casts, which I think is correct. >> ? 708???????????? if (Atomic::cmpxchg(header, rcvr->mark_addr(), >> mark) == mark) { >> >> but these do. >> ? 718???????????? if (Atomic::cmpxchg((void*)new_header, >> rcvr->mark_addr(), mark) == mark) { >> ? 737???????????? if (Atomic::cmpxchg((void*)new_header, >> rcvr->mark_addr(), header) == header) { >> >> I'm not sure how the ones with casts even compile?? mark_addr() seems >> to be a markOop*, which is a markOopDesc**, where markOopDesc is a >> class.? void* is not implicitly convertible to markOopDesc*. >> >> Hm, this entire file is #ifdef CC_INTERP.? Is this zero-only code?? Or >> something like that? >> >> Similarly here: >> ? 906?????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >> mark) == mark) { >> and >> ? 917?????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), mark) == mark) { >> ? 935?????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), header) == header) { >> >> and here: >> 1847?????????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >> mark) == mark) { >> 1858?????????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), mark) == mark) { >> 1878?????????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), header) == header) { >> >> and here: >> 1847?????????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >> mark) == mark) { >> 1858?????????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), mark) == mark) { >> 1878?????????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), header) == header) { >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/memory/metaspace.cpp >> 1502?? size_t value = OrderAccess::load_acquire(&_capacity_until_GC); >> ... >> 1537?? return (size_t)Atomic::sub((intptr_t)v, &_capacity_until_GC); >> >> These and other uses of _capacity_until_GC suggest that variable's >> type should be size_t rather than intptr_t.? Note that I haven't done >> a careful check of uses to see if there are any places where such a >> change would cause problems. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/constantPool.cpp >> ? 229?? OrderAccess::release_store((Klass* volatile *)adr, k); >> ? 246?? OrderAccess::release_store((Klass* volatile *)adr, k); >> ? 514?? OrderAccess::release_store((Klass* volatile *)adr, k); >> >> Casts are not needed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/constantPool.hpp >> ? 148???? volatile intptr_t adr = >> OrderAccess::load_acquire(obj_at_addr_raw(which)); >> >> [pre-existing] >> Why is adr declared volatile? >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/cpCache.cpp >> ? 157???? intx newflags = (value & parameter_size_mask); >> ? 158???? 
Atomic::cmpxchg(newflags, &_flags, (intx)0); >> >> This is a nice demonstration of why I wanted to include some value >> preserving integral conversions in cmpxchg, rather than requiring >> exact type matching in the integral case.? There have been some others >> that I haven't commented on.? Apparently we (I) got away with >> including such conversions in Atomic::add, which I'd forgotten about. >> And see comment regarding Atomic::sub below. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/cpCache.hpp >> ? 139?? volatile Metadata*?? _f1;?????? // entry specific metadata field >> >> [pre-existing] >> I suspect the type should be Metadata* volatile.? And that would >> eliminate the need for the cast here: >> >> ? 339?? Metadata* f1_ord() const?????????????????????? { return >> (Metadata *)OrderAccess::load_acquire(&_f1); } >> >> I don't know if there are any other changes needed or desirable around >> _f1 usage. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/method.hpp >> ? 139?? volatile address from_compiled_entry() const?? { return >> OrderAccess::load_acquire(&_from_compiled_entry); } >> ? 140?? volatile address from_compiled_entry_no_trampoline() const; >> ? 141?? volatile address from_interpreted_entry() const{ return >> OrderAccess::load_acquire(&_from_interpreted_entry); } >> >> [pre-existing] >> The volatile qualifiers here seem suspect to me. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/oop.inline.hpp >> ? 391???? narrowOop old = (narrowOop)Atomic::xchg(val, >> (narrowOop*)dest); >> >> Cast of return type is not needed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/prims/jni.cpp >> >> [pre-existing] >> >> copy_jni_function_table should be using Copy::disjoint_words_atomic. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/prims/jni.cpp >> >> [pre-existing] >> >> 3892?? // We're about to use Atomic::xchg for synchronization. Some Zero >> 3893?? // platforms use the GCC builtin __sync_lock_test_and_set for >> this, >> 3894?? // but __sync_lock_test_and_set is not guaranteed to do what >> we want >> 3895?? // on all architectures.? So we check it works before relying >> on it. >> 3896 #if defined(ZERO) && defined(ASSERT) >> 3897?? { >> 3898???? jint a = 0xcafebabe; >> 3899???? jint b = Atomic::xchg(0xdeadbeef, &a); >> 3900???? void *c = &a; >> 3901???? void *d = Atomic::xchg(&b, &c); >> 3902???? assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, >> "Atomic::xchg() works"); >> 3903???? assert(c == &b && d == &a, "Atomic::xchg() works"); >> 3904?? } >> 3905 #endif // ZERO && ASSERT >> >> It seems rather strange to be testing Atomic::xchg() here, rather than >> as part of unit testing Atomic?? Fail unit testing => don't try to >> use... >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/prims/jvmtiRawMonitor.cpp >> ? 130???? if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >> ? 142???? if (_owner == NULL && Atomic::cmpxchg_if_null((void*)Self, >> &_owner)) { >> >> I think these casts aren't needed. _owner is void*, and Self is >> Thread*, which is implicitly convertible to void*. >> >> Similarly here, for the THREAD argument: >> ? 280???? 
Contended = Atomic::cmpxchg((void*)THREAD, &_owner, >> (void*)NULL); >> ? 283???? Contended = Atomic::cmpxchg((void*)THREAD, &_owner, >> (void*)NULL); >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/prims/jvmtiRawMonitor.hpp >> >> This file is in the webrev, but seems to be unchanged. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/atomic.hpp >> ? 520 template >> ? 521 inline D Atomic::sub(I sub_value, D volatile* dest) { >> ? 522?? STATIC_ASSERT(IsPointer::value || IsIntegral::value); >> ? 523?? // Assumes two's complement integer representation. >> ? 524?? #pragma warning(suppress: 4146) >> ? 525?? return Atomic::add(-sub_value, dest); >> ? 526 } >> >> I'm pretty sure this implementation is incorrect.? I think it produces >> the wrong result when I and D are both unsigned integer types and >> sizeof(I) < sizeof(D). >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/mutex.cpp >> ? 304?? intptr_t v = Atomic::cmpxchg((intptr_t)_LBIT, >> &_LockWord.FullWord, (intptr_t)0);? // agro ... >> >> _LBIT should probably be intptr_t, rather than an enum.? Note that the >> enum type is unused.? The old value here is another place where an >> implicit widening of same signedness would have been nice. (Such >> implicit widening doesn't work for enums, since it's unspecified >> whether they default to signed or unsigned representation, and >> implementatinos differ.) >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/mutex.hpp >> >> [pre-existing] >> >> I think the Address member of the SplitWord union is unused. Looking >> at AcquireOrPush (and others), I'm wondering whether it *should* be >> used there, or whether just using intptr_t casts and doing integral >> arithmetic (as is presently being done) is easier and clearer. >> >> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp >> rather than polluting the global namespace.? And technically, that >> name is reserved word. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/objectMonitor.cpp >> ? 252?? void * cur = Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL); >> ? 409?? if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >> 1983?????? ox = (Thread*)Atomic::cmpxchg((void*)Self, &_owner, >> (void*)NULL); >> >> I think the casts of Self aren't needed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/objectMonitor.cpp >> ? 995?????? if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { >> 1020???????? if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { >> >> I think the casts of THREAD aren't needed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/objectMonitor.hpp >> ? 254?? markOopDesc* volatile* header_addr(); >> >> Why isn't this volatile markOop* ? >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/synchronizer.cpp >> ? 242???????? Atomic::cmpxchg_if_null((void*)Self, &(m->_owner))) { >> >> I think the cast of Self isn't needed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/synchronizer.cpp >> ? 992?? 
for (; block != NULL; block = (PaddedEnd >> *)next(block)) { >> 1734???? for (; block != NULL; block = (PaddedEnd >> *)next(block)) { >> >> [pre-existing] >> All calls to next() pass a PaddedEnd* and cast the >> result.? How about moving all that behavior into next(). >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/synchronizer.cpp >> 1970???? if (monitor > (ObjectMonitor *)&block[0] && >> 1971???????? monitor < (ObjectMonitor *)&block[_BLOCKSIZE]) { >> >> [pre-existing] >> Are the casts needed here?? I think PaddedEnd is >> derived from ObjectMonitor, so implicit conversions should apply. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/synchronizer.hpp >> ?? 28 #include "memory/padded.hpp" >> ? 163?? static PaddedEnd * volatile gBlockList; >> >> I was going to suggest as an alternative just making gBlockList a file >> scoped variable in synchronizer.cpp, since it isn't used outside of >> that file. Except that it is referenced by vmStructs.? Curses! >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/thread.cpp >> 4707?? intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, >> (intptr_t)0); >> >> This and other places suggest LOCKBIT should be defined as intptr_t, >> rather than as an enum value.? The MuxBits enum type is unused. >> >> And the cast of 0 is another case where implicit widening would be nice. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/services/mallocSiteTable.cpp >> ? 261 bool MallocSiteHashtableEntry::atomic_insert(const >> MallocSiteHashtableEntry* entry) { >> ? 262?? return Atomic::cmpxchg_if_null(entry, (const >> MallocSiteHashtableEntry**)&_next); >> ? 263 } >> >> I think the problem here that is leading to the cast is that >> atomic_insert is taking a const T*.? Note that it's only caller passes >> a non-const T*. >> >> ------------------------------------------------------------------------------ >> >> From coleen.phillimore at oracle.com Fri Oct 13 18:34:43 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 13 Oct 2017 14:34:43 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> Message-ID: Hi, Here is the version with the changes from Kim's comments that has passed at least testing with JPRT and tier1, locally.?? More testing (tier2-5) is in progress. Also includes a corrected version of Atomic::sub care of Erik Osterlund. open webrev at http://cr.openjdk.java.net/~coleenp/8188220.kim-review-changes/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev Full version: http://cr.openjdk.java.net/~coleenp/8188220.03/webrev Thanks! Coleen On 10/13/17 9:25 AM, coleen.phillimore at oracle.com wrote: > > Hi Kim, Thank you for the detailed review and the time you've spent on > it, and discussion yesterday. > > On 10/12/17 7:17 PM, Kim Barrett wrote: >>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>> >>> Summary: With the new template functions these are unnecessary. >>> >>> The changes are mostly s/_ptr// and removing the cast to return >>> type.? 
There weren't many types that needed to be improved to match >>> the template version of the function.?? Some notes: >>> 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, rearranging >>> arguments. >>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.? I >>> disliked the first name because it's not explicit from the callers >>> that there's an underlying cas.? If people want to fight, I'll >>> remove the function and use cmpxchg because there are only a couple >>> places where this is a little nicer. >>> 3. Added Atomic::sub() >>> >>> Tested with JPRT, mach5 tier1-5 on linux,windows and solaris. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8188220 >>> >>> Thanks, >>> Coleen >> I looked harder at the potential ABA problems, and believe they are >> okay.? There can be multiple threads doing pushes, and there can be >> multiple threads doing pops, but not both at the same time. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/cpu/zero/cppInterpreter_zero.cpp >> ? 279???? if (Atomic::cmpxchg(monitor, lockee->mark_addr(), disp) != >> disp) { >> >> How does this work?? monitor and disp seem like they have unrelated >> types?? Given that this is zero-specific code, maybe this hasn't been >> tested? >> >> Similarly here: >> ? 423?????? if (Atomic::cmpxchg(header, rcvr->mark_addr(), lock) != >> lock) { > > I haven't built zero.? I don't know how to do this anymore (help?) I > fixed the obvious type mismatches here and in > bytecodeInterpreter.cpp.? I'll try to build it. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/asm/assembler.cpp >> ? 239???????? dcon->value_fn = cfn; >> >> Is it actually safe to remove the atomic update?? If multiple threads >> performing the assignment *are* possible (and I don't understand the >> context yet, so don't know the answer to that), then a bare non-atomic >> assignment is a race, e.g. undefined behavior. >> >> Regardless of that, I think the CAST_FROM_FN_PTR should be retained. > > I can find no uses of this code, ie. looking for "delayed_value". I > think it was early jsr292 code.? I could also not find any combination > of casts that would make it compile, so in the end I believed the > comment and took out the cmpxchg.?? The code appears to be intended to > for bootstrapping, see the call to update_delayed_values() in > JavaClasses::compute_offsets(). > > The CAST_FROM_FN_PTR was to get it to compile with cmpxchg, the new > code does not require a cast.? If you can help with finding the right > set of casts, I'd be happy to put the cmpxchg back in. I just couldn't > find one. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/classfile/classLoaderData.cpp >> ? 167?? Chunk* head = (Chunk*) OrderAccess::load_acquire(&_head); >> >> I think the cast to Chunk* is no longer needed. > > Missed another, thanks.? No that's the same one David found. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/classfile/classLoaderData.cpp >> ? 946???? ClassLoaderData* old = Atomic::cmpxchg(cld, cld_addr, >> (ClassLoaderData*)NULL); >> ? 947???? if (old != NULL) { >> ? 948?????? delete cld; >> ? 949?????? // Returns the data. >> ? 950?????? return old; >> ? 951???? } >> >> That could instead be >> >> ?? 
if (!Atomic::replace_if_null(cld, cld_addr)) { >> ???? delete cld;?????????? // Lost the race. >> ???? return *cld_addr;???? // Use the winner's value. >> ?? } >> >> And apparently the caller of CLDG::add doesn't care whether the >> returned CLD has actually been added to the graph yet.? If that's not >> true, then there's a bug here, since a race loser might return a >> winner's value before the winner has actually done the insertion. > > True, the race loser doesn't care whether the CLD has been added to > the graph. > Your instead code requires a comment that replace_if_null is really a > compare exchange and has an extra read of the original value, so I am > leaving what I have which is clearer to me. > >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/classfile/verifier.cpp >> ?? 71 static void* verify_byte_codes_fn() { >> ?? 72?? if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == NULL) { >> ?? 73???? void *lib_handle = os::native_java_library(); >> ?? 74???? void *func = os::dll_lookup(lib_handle, >> "VerifyClassCodesForMajorVersion"); >> ?? 75???? OrderAccess::release_store(&_verify_byte_codes_fn, func); >> ?? 76???? if (func == NULL) { >> ?? 77?????? _is_new_verify_byte_codes_fn = false; >> ?? 78?????? func = os::dll_lookup(lib_handle, "VerifyClassCodes"); >> ?? 79 OrderAccess::release_store(&_verify_byte_codes_fn, func); >> ?? 80???? } >> ?? 81?? } >> ?? 82?? return (void*)_verify_byte_codes_fn; >> ?? 83 } >> >> [pre-existing] >> >> I think this code has race problems; a caller could unexpectedly and >> inappropriately return NULL.? Consider the case where there is no >> VerifyClassCodesForMajorVersion, but there is VerifyClassCodes. >> >> The variable is initially NULL. >> >> Both Thread1 and Thread2 reach line 73, having both seen a NULL value >> for the variable. >> >> Thread1 reaches line 80, setting the variable to VerifyClassCodes. >> >> Thread2 reaches line 76, resetting the variable to NULL. >> >> Thread1 reads the now (momentarily) NULL value and returns it. >> >> I think the first release_store should be conditional on func != NULL. >> Also, the usage of _is_new_verify_byte_codes_fn seems suspect. >> And a minor additional nit: the cast in the return is unnecessary. > > Yes, this looks like a bug.?? I'll cut/paste this and file it. It may > be that this is support for the old verifier in old jdk versions that > can be cleaned up. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/code/nmethod.cpp >> 1664?? nmethod* observed_mark_link = _oops_do_mark_link; >> 1665?? if (observed_mark_link == NULL) { >> 1666???? // Claim this nmethod for this thread to mark. >> 1667???? if (Atomic::cmpxchg_if_null(NMETHOD_SENTINEL, >> &_oops_do_mark_link)) { >> >> With these changes, the only use of observed_mark_link is in the if. >> I'm not sure that variable is really useful anymore, e.g. just use >> >> ?? if (_oops_do_mark_link == NULL) { > > Ok fixed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >> >> In CMSCollector::par_take_from_overflow_list, if BUSY and prefix were >> of type oopDesc*, I think there would be a whole lot fewer casts and >> cast_to_oop's.? Later on, I think suffix_head, observed_overflow_list, >> and curr_overflow_list could also be oopDesc* instead of oop to >> eliminate more casts. 
> > I actually tried to make this change but ran into more fan out that > way, so went back and just fixed the cmpxchg calls to cast oops to > oopDesc* and things were less perturbed that way. >> >> And some similar changes in CMSCollector::par_push_on_overflow_list. >> >> And similarly in parNewGeneration.cpp, in push_on_overflow_list and >> take_from_overflow_list_work. >> >> As noted in the comments for JDK-8165857, the lists and "objects" >> involved here aren't really oops, but rather the shattered remains of > > Yes, somewhat horrified at the value of BUSY. >> oops.? The suggestion there was to use HeapWord* and carry through the >> fanout; what was actually done was to change _overflow_list to >> oopDesc* to minimize fanout, even though that's kind of lying to the >> type system.? Now, with the cleanup of cmpxchg_ptr and such, we're >> paying the price of doing the minimal thing back then. > > I will file an RFE about cleaning this up.? I think what I've done was > the minimal thing. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >> 7960?? Atomic::add(-n, &_num_par_pushes); >> >> Atomic::sub > > fixed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/cms/parNewGeneration.cpp >> 1455?? Atomic::add(-n, &_num_par_pushes); > fixed. >> Atomic::sub >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/dirtyCardQueue.cpp >> ? 283???? void* actual = Atomic::cmpxchg(next, &_cur_par_buffer_node, >> nd); >> ... >> ? 289?????? nd = static_cast(actual); >> >> Change actual's type to BufferNode* and remove the cast on line 289. > > fixed.? missed that one. gross. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/g1CollectedHeap.cpp >> >> [pre-existing] >> 3499???????? old = (CompiledMethod*)_postponed_list; >> >> I think that cast is only needed because >> G1CodeCacheUnloadingTask::_postponed_list is incorrectly typed as >> "volatile CompiledMethod*", when I think it ought to be >> "CompiledMethod* volatile". >> >> I think G1CodeCacheUnloading::_claimed_nmethod is similarly mis-typed, >> with a similar should not be needed cast: >> 3530?????? first = (CompiledMethod*)_claimed_nmethod; >> >> and another for _postponed_list here: >> 3552?????? claim = (CompiledMethod*)_postponed_list; > > I've fixed this.?? C++ is so confusing about where to put the > volatile.?? Everyone has been tripped up by it. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/g1HotCardCache.cpp >> ?? 77?? jbyte* previous_ptr = (jbyte*)Atomic::cmpxchg(card_ptr, >> >> I think the cast of the cmpxchg result is no longer needed. > > fixed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp >> ? 254?????? char* touch_addr = (char*)Atomic::add(actual_chunk_size, >> &_cur_addr) - actual_chunk_size; >> >> I think the cast of the add result is no longer needed. > got it already. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/g1StringDedup.cpp >> ? 213?? return (size_t)Atomic::add(partition_size, &_next_bucket) - >> partition_size; >> >> I think the cast of the add result is no longer needed. > > I was slacking in the g1 files.? 
fixed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >> ? 200?????? PerRegionTable* res = >> ? 201???????? Atomic::cmpxchg(nxt, &_free_list, fl); >> >> Please remove the line break, now that the code has been simplified. >> >> But wait, doesn't this alloc exhibit classic ABA problems?? I *think* >> this works because alloc and bulk_free are called in different phases, >> never overlapping. > > I don't know.? Do you want to file a bug to investigate this? > fixed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/g1/sparsePRT.cpp >> ? 295???? SparsePRT* res = >> ? 296?????? Atomic::cmpxchg(sprt, &_head_expanded_list, hd); >> and >> ? 307???? SparsePRT* res = >> ? 308?????? Atomic::cmpxchg(next, &_head_expanded_list, hd); >> >> I'd rather not have the line breaks in these either. >> >> And get_from_expanded_list also appears to have classic ABA problems. >> I *think* this works because add_to_expanded_list and >> get_from_expanded_list are called in different phases, never >> overlapping. > > Fixed, same question as above?? Or one bug to investigate both? >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/gc/shared/taskqueue.inline.hpp >> ? 262?? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >> ? 263?????????????????????????????????? (volatile intptr_t *)&_data, >> ? 264 (intptr_t)old_age._data); >> >> This should be >> >> ?? return Atomic::cmpxchg(new_age._data, &_data, old_age._data); > > fixed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/interpreter/bytecodeInterpreter.cpp >> This doesn't have any casts, which I think is correct. >> ? 708???????????? if (Atomic::cmpxchg(header, rcvr->mark_addr(), >> mark) == mark) { >> >> but these do. >> ? 718???????????? if (Atomic::cmpxchg((void*)new_header, >> rcvr->mark_addr(), mark) == mark) { >> ? 737???????????? if (Atomic::cmpxchg((void*)new_header, >> rcvr->mark_addr(), header) == header) { >> >> I'm not sure how the ones with casts even compile?? mark_addr() seems >> to be a markOop*, which is a markOopDesc**, where markOopDesc is a >> class.? void* is not implicitly convertible to markOopDesc*. >> >> Hm, this entire file is #ifdef CC_INTERP.? Is this zero-only code?? Or >> something like that? >> >> Similarly here: >> ? 906?????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >> mark) == mark) { >> and >> ? 917?????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), mark) == mark) { >> ? 935?????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), header) == header) { >> >> and here: >> 1847?????????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >> mark) == mark) { >> 1858?????????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), mark) == mark) { >> 1878?????????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), header) == header) { >> >> and here: >> 1847?????????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >> mark) == mark) { >> 1858?????????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), mark) == mark) { >> 1878?????????????? if (Atomic::cmpxchg((void*)new_header, >> lockee->mark_addr(), header) == header) { > > I've changed all these.?? This is part of Zero. 
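A stand-alone sketch of why giving new_header the markOop type removes the (void*) casts: the template cmpxchg requires the exchange value to convert to the destination's pointee type. The helper below is invented for illustration and is not the actual bytecodeInterpreter change:

class markOopDesc;                   // forward declaration, as in HotSpot
typedef markOopDesc* markOop;

static markOop cas_set_mark(markOop new_header, markOop* mark_addr, markOop old_mark) {
  // new_header matches *mark_addr (both markOop), so no (void*) cast is needed;
  // with the template cmpxchg such a cast would not even compile.
  return Atomic::cmpxchg(new_header, mark_addr, old_mark);
}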
>> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/memory/metaspace.cpp >> 1502?? size_t value = OrderAccess::load_acquire(&_capacity_until_GC); >> ... >> 1537?? return (size_t)Atomic::sub((intptr_t)v, &_capacity_until_GC); >> >> These and other uses of _capacity_until_GC suggest that variable's >> type should be size_t rather than intptr_t.? Note that I haven't done >> a careful check of uses to see if there are any places where such a >> change would cause problems. > > Yes, I had a hard time with metaspace.cpp because I agree > _capacity_until_GC should be size_t.?? Tried to make this change and > it cascaded a bit.? I'll file an RFE to change this type separately. > >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/constantPool.cpp >> ? 229?? OrderAccess::release_store((Klass* volatile *)adr, k); >> ? 246?? OrderAccess::release_store((Klass* volatile *)adr, k); >> ? 514?? OrderAccess::release_store((Klass* volatile *)adr, k); >> >> Casts are not needed. > > fixed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/constantPool.hpp >> ? 148???? volatile intptr_t adr = >> OrderAccess::load_acquire(obj_at_addr_raw(which)); >> >> [pre-existing] >> Why is adr declared volatile? > > golly beats me.? concurrency is scary, especially in the constant pool. > The load_acquire() should make sure the value is fetched from memory > so volatile is unneeded. > >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/cpCache.cpp >> ? 157???? intx newflags = (value & parameter_size_mask); >> ? 158???? Atomic::cmpxchg(newflags, &_flags, (intx)0); >> >> This is a nice demonstration of why I wanted to include some value >> preserving integral conversions in cmpxchg, rather than requiring >> exact type matching in the integral case.? There have been some others >> that I haven't commented on.? Apparently we (I) got away with >> including such conversions in Atomic::add, which I'd forgotten about. >> And see comment regarding Atomic::sub below. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/cpCache.hpp >> ? 139?? volatile Metadata*?? _f1;?????? // entry specific metadata field >> >> [pre-existing] >> I suspect the type should be Metadata* volatile.? And that would >> eliminate the need for the cast here: >> >> ? 339?? Metadata* f1_ord() const?????????????????????? { return >> (Metadata *)OrderAccess::load_acquire(&_f1); } >> >> I don't know if there are any other changes needed or desirable around >> _f1 usage. > > yes, fixed this. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/method.hpp >> ? 139?? volatile address from_compiled_entry() const?? { return >> OrderAccess::load_acquire(&_from_compiled_entry); } >> ? 140?? volatile address from_compiled_entry_no_trampoline() const; >> ? 141?? volatile address from_interpreted_entry() const{ return >> OrderAccess::load_acquire(&_from_interpreted_entry); } >> >> [pre-existing] >> The volatile qualifiers here seem suspect to me. > > Again much suspicion about concurrency and giant pain, which I > remember, of debugging these when they were broken. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/oop.inline.hpp >> ? 391???? 
narrowOop old = (narrowOop)Atomic::xchg(val, >> (narrowOop*)dest); >> >> Cast of return type is not needed. > > fixed. > >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/prims/jni.cpp >> >> [pre-existing] >> >> copy_jni_function_table should be using Copy::disjoint_words_atomic. > > yuck. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/prims/jni.cpp >> >> [pre-existing] >> >> 3892?? // We're about to use Atomic::xchg for synchronization. Some Zero >> 3893?? // platforms use the GCC builtin __sync_lock_test_and_set for >> this, >> 3894?? // but __sync_lock_test_and_set is not guaranteed to do what >> we want >> 3895?? // on all architectures.? So we check it works before relying >> on it. >> 3896 #if defined(ZERO) && defined(ASSERT) >> 3897?? { >> 3898???? jint a = 0xcafebabe; >> 3899???? jint b = Atomic::xchg(0xdeadbeef, &a); >> 3900???? void *c = &a; >> 3901???? void *d = Atomic::xchg(&b, &c); >> 3902???? assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, >> "Atomic::xchg() works"); >> 3903???? assert(c == &b && d == &a, "Atomic::xchg() works"); >> 3904?? } >> 3905 #endif // ZERO && ASSERT >> >> It seems rather strange to be testing Atomic::xchg() here, rather than >> as part of unit testing Atomic?? Fail unit testing => don't try to >> use... > > This is zero.? I'm not touching this. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/prims/jvmtiRawMonitor.cpp >> ? 130???? if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >> ? 142???? if (_owner == NULL && Atomic::cmpxchg_if_null((void*)Self, >> &_owner)) { >> >> I think these casts aren't needed. _owner is void*, and Self is >> Thread*, which is implicitly convertible to void*. >> >> Similarly here, for the THREAD argument: >> ? 280???? Contended = Atomic::cmpxchg((void*)THREAD, &_owner, >> (void*)NULL); >> ? 283???? Contended = Atomic::cmpxchg((void*)THREAD, &_owner, >> (void*)NULL); > > Okay, let me see if the compiler(s) eat that. (yes they do) >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/prims/jvmtiRawMonitor.hpp >> >> This file is in the webrev, but seems to be unchanged. > > It'll be cleaned up with the the commit and not be part of the changeset. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/atomic.hpp >> ? 520 template >> ? 521 inline D Atomic::sub(I sub_value, D volatile* dest) { >> ? 522?? STATIC_ASSERT(IsPointer::value || IsIntegral::value); >> ? 523?? // Assumes two's complement integer representation. >> ? 524?? #pragma warning(suppress: 4146) >> ? 525?? return Atomic::add(-sub_value, dest); >> ? 526 } >> >> I'm pretty sure this implementation is incorrect.? I think it produces >> the wrong result when I and D are both unsigned integer types and >> sizeof(I) < sizeof(D). > > Can you suggest a correction?? I just copied Atomic::dec(). >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/mutex.cpp >> ? 304?? intptr_t v = Atomic::cmpxchg((intptr_t)_LBIT, >> &_LockWord.FullWord, (intptr_t)0);? // agro ... >> >> _LBIT should probably be intptr_t, rather than an enum.? Note that the >> enum type is unused.? The old value here is another place where an >> implicit widening of same signedness would have been nice. 
(Such >> implicit widening doesn't work for enums, since it's unspecified >> whether they default to signed or unsigned representation, and >> implementatinos differ.) > > This would be a good/simple cleanup.? I changed it to const intptr_t > _LBIT = 1; >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/mutex.hpp >> >> [pre-existing] >> >> I think the Address member of the SplitWord union is unused. Looking >> at AcquireOrPush (and others), I'm wondering whether it *should* be >> used there, or whether just using intptr_t casts and doing integral >> arithmetic (as is presently being done) is easier and clearer. >> >> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp >> rather than polluting the global namespace.? And technically, that >> name is reserved word. > > I moved both this and _LBIT into the top of mutex.cpp since they are > used there. > Cant define const intptr_t _LBIT =1; in a class in our version of C++. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/objectMonitor.cpp >> ? 252?? void * cur = Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL); >> ? 409?? if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >> 1983?????? ox = (Thread*)Atomic::cmpxchg((void*)Self, &_owner, >> (void*)NULL); >> >> I think the casts of Self aren't needed. > > fixed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/objectMonitor.cpp >> ? 995?????? if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { >> 1020???????? if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { >> >> I think the casts of THREAD aren't needed. > > nope, fixed. >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/objectMonitor.hpp >> ? 254?? markOopDesc* volatile* header_addr(); >> >> Why isn't this volatile markOop* ? > > fixed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/synchronizer.cpp >> ? 242???????? Atomic::cmpxchg_if_null((void*)Self, &(m->_owner))) { >> >> I think the cast of Self isn't needed. > > fixed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/synchronizer.cpp >> ? 992?? for (; block != NULL; block = (PaddedEnd >> *)next(block)) { >> 1734???? for (; block != NULL; block = (PaddedEnd >> *)next(block)) { >> >> [pre-existing] >> All calls to next() pass a PaddedEnd* and cast the >> result.? How about moving all that behavior into next(). > > I fixed this next() function, but it necessitated a cast to FreeNext > field.? The PaddedEnd<> type was intentionally not propagated to all > the things that use it.?? Which is a shame because there are a lot > more casts to PaddedEnd that could have been removed. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/synchronizer.cpp >> 1970???? if (monitor > (ObjectMonitor *)&block[0] && >> 1971???????? monitor < (ObjectMonitor *)&block[_BLOCKSIZE]) { >> >> [pre-existing] >> Are the casts needed here?? I think PaddedEnd is >> derived from ObjectMonitor, so implicit conversions should apply. > > prob not.? removed them. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/synchronizer.hpp >> ?? 28 #include "memory/padded.hpp" >> ? 
163?? static PaddedEnd * volatile gBlockList; >> >> I was going to suggest as an alternative just making gBlockList a file >> scoped variable in synchronizer.cpp, since it isn't used outside of >> that file. Except that it is referenced by vmStructs.? Curses! > > It's also used by the SA. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/runtime/thread.cpp >> 4707?? intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, >> (intptr_t)0); >> >> This and other places suggest LOCKBIT should be defined as intptr_t, >> rather than as an enum value.? The MuxBits enum type is unused. >> >> And the cast of 0 is another case where implicit widening would be nice. > > Making LOCKBIT a const intptr_t = 1 removes a lot of casts. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/services/mallocSiteTable.cpp >> ? 261 bool MallocSiteHashtableEntry::atomic_insert(const >> MallocSiteHashtableEntry* entry) { >> ? 262?? return Atomic::cmpxchg_if_null(entry, (const >> MallocSiteHashtableEntry**)&_next); >> ? 263 } >> >> I think the problem here that is leading to the cast is that >> atomic_insert is taking a const T*.? Note that it's only caller passes >> a non-const T*. > > I'll change the type to non-const.? We try to use consts... > > Thanks for the detailed review!? The gcc compiler seems happy so far, > I'll post a webrev of the result of these changes after fixing > Atomic::sub() and seeing how the other compilers deal with these changes. > > Thanks, > Coleen > >> >> ------------------------------------------------------------------------------ >> >> > From david.holmes at oracle.com Sat Oct 14 12:32:14 2017 From: david.holmes at oracle.com (David Holmes) Date: Sat, 14 Oct 2017 22:32:14 +1000 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> Message-ID: <7265c30d-946b-19c4-a1b3-c3314a869ee8@oracle.com> Hi Coleen, These changes all seem okay to me - except I can't comment on the Atomic::sub implementation. :) Thanks for adding the assert to header_addr(). FYI from objectMonitor.hpp: // ObjectMonitor Layout Overview/Highlights/Restrictions: // // - The _header field must be at offset 0 because the displaced header // from markOop is stored there. We do not want markOop.hpp to include // ObjectMonitor.hpp to avoid exposing ObjectMonitor everywhere. This // means that ObjectMonitor cannot inherit from any other class nor can // it use any virtual member functions. This restriction is critical to // the proper functioning of the VM. so it is important we ensure this holds. Thanks, David On 14/10/2017 4:34 AM, coleen.phillimore at oracle.com wrote: > > Hi, Here is the version with the changes from Kim's comments that has > passed at least testing with JPRT and tier1, locally.?? More testing > (tier2-5) is in progress. > > Also includes a corrected version of Atomic::sub care of Erik Osterlund. > > open webrev at > http://cr.openjdk.java.net/~coleenp/8188220.kim-review-changes/webrev > open webrev at > http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev > > Full version: > > http://cr.openjdk.java.net/~coleenp/8188220.03/webrev > > Thanks! 
> Coleen > > On 10/13/17 9:25 AM, coleen.phillimore at oracle.com wrote: >> >> Hi Kim, Thank you for the detailed review and the time you've spent on >> it, and discussion yesterday. >> >> On 10/12/17 7:17 PM, Kim Barrett wrote: >>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Summary: With the new template functions these are unnecessary. >>>> >>>> The changes are mostly s/_ptr// and removing the cast to return >>>> type.? There weren't many types that needed to be improved to match >>>> the template version of the function.?? Some notes: >>>> 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, rearranging >>>> arguments. >>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.? I >>>> disliked the first name because it's not explicit from the callers >>>> that there's an underlying cas.? If people want to fight, I'll >>>> remove the function and use cmpxchg because there are only a couple >>>> places where this is a little nicer. >>>> 3. Added Atomic::sub() >>>> >>>> Tested with JPRT, mach5 tier1-5 on linux,windows and solaris. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8188220 >>>> >>>> Thanks, >>>> Coleen >>> I looked harder at the potential ABA problems, and believe they are >>> okay.? There can be multiple threads doing pushes, and there can be >>> multiple threads doing pops, but not both at the same time. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/cpu/zero/cppInterpreter_zero.cpp >>> ? 279???? if (Atomic::cmpxchg(monitor, lockee->mark_addr(), disp) != >>> disp) { >>> >>> How does this work?? monitor and disp seem like they have unrelated >>> types?? Given that this is zero-specific code, maybe this hasn't been >>> tested? >>> >>> Similarly here: >>> ? 423?????? if (Atomic::cmpxchg(header, rcvr->mark_addr(), lock) != >>> lock) { >> >> I haven't built zero.? I don't know how to do this anymore (help?) I >> fixed the obvious type mismatches here and in >> bytecodeInterpreter.cpp.? I'll try to build it. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/asm/assembler.cpp >>> ? 239???????? dcon->value_fn = cfn; >>> >>> Is it actually safe to remove the atomic update?? If multiple threads >>> performing the assignment *are* possible (and I don't understand the >>> context yet, so don't know the answer to that), then a bare non-atomic >>> assignment is a race, e.g. undefined behavior. >>> >>> Regardless of that, I think the CAST_FROM_FN_PTR should be retained. >> >> I can find no uses of this code, ie. looking for "delayed_value". I >> think it was early jsr292 code.? I could also not find any combination >> of casts that would make it compile, so in the end I believed the >> comment and took out the cmpxchg.?? The code appears to be intended to >> for bootstrapping, see the call to update_delayed_values() in >> JavaClasses::compute_offsets(). >> >> The CAST_FROM_FN_PTR was to get it to compile with cmpxchg, the new >> code does not require a cast.? If you can help with finding the right >> set of casts, I'd be happy to put the cmpxchg back in. I just couldn't >> find one. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/classfile/classLoaderData.cpp >>> ? 167?? 
Chunk* head = (Chunk*) OrderAccess::load_acquire(&_head); >>> >>> I think the cast to Chunk* is no longer needed. >> >> Missed another, thanks.? No that's the same one David found. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/classfile/classLoaderData.cpp >>> ? 946???? ClassLoaderData* old = Atomic::cmpxchg(cld, cld_addr, >>> (ClassLoaderData*)NULL); >>> ? 947???? if (old != NULL) { >>> ? 948?????? delete cld; >>> ? 949?????? // Returns the data. >>> ? 950?????? return old; >>> ? 951???? } >>> >>> That could instead be >>> >>> ?? if (!Atomic::replace_if_null(cld, cld_addr)) { >>> ???? delete cld;?????????? // Lost the race. >>> ???? return *cld_addr;???? // Use the winner's value. >>> ?? } >>> >>> And apparently the caller of CLDG::add doesn't care whether the >>> returned CLD has actually been added to the graph yet.? If that's not >>> true, then there's a bug here, since a race loser might return a >>> winner's value before the winner has actually done the insertion. >> >> True, the race loser doesn't care whether the CLD has been added to >> the graph. >> Your instead code requires a comment that replace_if_null is really a >> compare exchange and has an extra read of the original value, so I am >> leaving what I have which is clearer to me. >> >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/classfile/verifier.cpp >>> ?? 71 static void* verify_byte_codes_fn() { >>> ?? 72?? if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == NULL) { >>> ?? 73???? void *lib_handle = os::native_java_library(); >>> ?? 74???? void *func = os::dll_lookup(lib_handle, >>> "VerifyClassCodesForMajorVersion"); >>> ?? 75???? OrderAccess::release_store(&_verify_byte_codes_fn, func); >>> ?? 76???? if (func == NULL) { >>> ?? 77?????? _is_new_verify_byte_codes_fn = false; >>> ?? 78?????? func = os::dll_lookup(lib_handle, "VerifyClassCodes"); >>> ?? 79 OrderAccess::release_store(&_verify_byte_codes_fn, func); >>> ?? 80???? } >>> ?? 81?? } >>> ?? 82?? return (void*)_verify_byte_codes_fn; >>> ?? 83 } >>> >>> [pre-existing] >>> >>> I think this code has race problems; a caller could unexpectedly and >>> inappropriately return NULL.? Consider the case where there is no >>> VerifyClassCodesForMajorVersion, but there is VerifyClassCodes. >>> >>> The variable is initially NULL. >>> >>> Both Thread1 and Thread2 reach line 73, having both seen a NULL value >>> for the variable. >>> >>> Thread1 reaches line 80, setting the variable to VerifyClassCodes. >>> >>> Thread2 reaches line 76, resetting the variable to NULL. >>> >>> Thread1 reads the now (momentarily) NULL value and returns it. >>> >>> I think the first release_store should be conditional on func != NULL. >>> Also, the usage of _is_new_verify_byte_codes_fn seems suspect. >>> And a minor additional nit: the cast in the return is unnecessary. >> >> Yes, this looks like a bug.?? I'll cut/paste this and file it. It may >> be that this is support for the old verifier in old jdk versions that >> can be cleaned up. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/code/nmethod.cpp >>> 1664?? nmethod* observed_mark_link = _oops_do_mark_link; >>> 1665?? if (observed_mark_link == NULL) { >>> 1666???? // Claim this nmethod for this thread to mark. >>> 1667???? 
if (Atomic::cmpxchg_if_null(NMETHOD_SENTINEL, >>> &_oops_do_mark_link)) { >>> >>> With these changes, the only use of observed_mark_link is in the if. >>> I'm not sure that variable is really useful anymore, e.g. just use >>> >>> if (_oops_do_mark_link == NULL) { >> >> Ok fixed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>> >>> In CMSCollector::par_take_from_overflow_list, if BUSY and prefix were >>> of type oopDesc*, I think there would be a whole lot fewer casts and >>> cast_to_oop's. Later on, I think suffix_head, observed_overflow_list, >>> and curr_overflow_list could also be oopDesc* instead of oop to >>> eliminate more casts. >> >> I actually tried to make this change but ran into more fan out that >> way, so went back and just fixed the cmpxchg calls to cast oops to >> oopDesc* and things were less perturbed that way. >>> >>> And some similar changes in CMSCollector::par_push_on_overflow_list. >>> >>> And similarly in parNewGeneration.cpp, in push_on_overflow_list and >>> take_from_overflow_list_work. >>> >>> As noted in the comments for JDK-8165857, the lists and "objects" >>> involved here aren't really oops, but rather the shattered remains of >> >> Yes, somewhat horrified at the value of BUSY. >>> oops. The suggestion there was to use HeapWord* and carry through the >>> fanout; what was actually done was to change _overflow_list to >>> oopDesc* to minimize fanout, even though that's kind of lying to the >>> type system. Now, with the cleanup of cmpxchg_ptr and such, we're >>> paying the price of doing the minimal thing back then. >> >> I will file an RFE about cleaning this up. I think what I've done was >> the minimal thing. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>> 7960 Atomic::add(-n, &_num_par_pushes); >>> >>> Atomic::sub >> >> fixed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/cms/parNewGeneration.cpp >>> 1455 Atomic::add(-n, &_num_par_pushes); >> fixed. >>> Atomic::sub >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/g1/dirtyCardQueue.cpp >>> 283 void* actual = Atomic::cmpxchg(next, &_cur_par_buffer_node, >>> nd); >>> ... >>> 289 nd = static_cast<BufferNode*>(actual); >>> >>> Change actual's type to BufferNode* and remove the cast on line 289. >> >> fixed. missed that one. gross. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/g1/g1CollectedHeap.cpp >>> >>> [pre-existing] >>> 3499 old = (CompiledMethod*)_postponed_list; >>> >>> I think that cast is only needed because >>> G1CodeCacheUnloadingTask::_postponed_list is incorrectly typed as >>> "volatile CompiledMethod*", when I think it ought to be >>> "CompiledMethod* volatile". >>> >>> I think G1CodeCacheUnloading::_claimed_nmethod is similarly mis-typed, >>> with a similar should not be needed cast: >>> 3530 first = (CompiledMethod*)_claimed_nmethod; >>> >>> and another for _postponed_list here: >>> 3552 claim = (CompiledMethod*)_postponed_list; >> >> I've fixed this. C++ is so confusing about where to put the >> volatile. Everyone has been tripped up by it. 
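To make the volatile-placement point above concrete, here is a minimal sketch (not taken from the webrev; the forward declaration stands in for the real class) of why the position of the qualifier decides whether a cast is needed:

  struct CompiledMethod;        // placeholder forward declaration

  volatile CompiledMethod* a;   // pointer to volatile data: converting a
                                // load of 'a' to a plain CompiledMethod*
                                // discards the pointee's volatile, hence
                                // the casts flagged above
  CompiledMethod* volatile b;   // volatile pointer to plain data: the
                                // pointer itself is the volatile object,
                                // so it converts without a cast

  CompiledMethod* read_a() { return (CompiledMethod*)a; }  // cast required
  CompiledMethod* read_b() { return b; }                    // no cast needed

Declaring _postponed_list and _claimed_nmethod as CompiledMethod* volatile is what lets the loads at 3499/3530/3552 drop their casts.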
>>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/g1/g1HotCardCache.cpp >>> ?? 77?? jbyte* previous_ptr = (jbyte*)Atomic::cmpxchg(card_ptr, >>> >>> I think the cast of the cmpxchg result is no longer needed. >> >> fixed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp >>> ? 254?????? char* touch_addr = (char*)Atomic::add(actual_chunk_size, >>> &_cur_addr) - actual_chunk_size; >>> >>> I think the cast of the add result is no longer needed. >> got it already. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/g1/g1StringDedup.cpp >>> ? 213?? return (size_t)Atomic::add(partition_size, &_next_bucket) - >>> partition_size; >>> >>> I think the cast of the add result is no longer needed. >> >> I was slacking in the g1 files.? fixed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >>> ? 200?????? PerRegionTable* res = >>> ? 201???????? Atomic::cmpxchg(nxt, &_free_list, fl); >>> >>> Please remove the line break, now that the code has been simplified. >>> >>> But wait, doesn't this alloc exhibit classic ABA problems?? I *think* >>> this works because alloc and bulk_free are called in different phases, >>> never overlapping. >> >> I don't know.? Do you want to file a bug to investigate this? >> fixed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/g1/sparsePRT.cpp >>> ? 295???? SparsePRT* res = >>> ? 296?????? Atomic::cmpxchg(sprt, &_head_expanded_list, hd); >>> and >>> ? 307???? SparsePRT* res = >>> ? 308?????? Atomic::cmpxchg(next, &_head_expanded_list, hd); >>> >>> I'd rather not have the line breaks in these either. >>> >>> And get_from_expanded_list also appears to have classic ABA problems. >>> I *think* this works because add_to_expanded_list and >>> get_from_expanded_list are called in different phases, never >>> overlapping. >> >> Fixed, same question as above?? Or one bug to investigate both? >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>> ? 262?? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>> ? 263?????????????????????????????????? (volatile intptr_t *)&_data, >>> ? 264 (intptr_t)old_age._data); >>> >>> This should be >>> >>> ?? return Atomic::cmpxchg(new_age._data, &_data, old_age._data); >> >> fixed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/interpreter/bytecodeInterpreter.cpp >>> This doesn't have any casts, which I think is correct. >>> ? 708???????????? if (Atomic::cmpxchg(header, rcvr->mark_addr(), >>> mark) == mark) { >>> >>> but these do. >>> ? 718???????????? if (Atomic::cmpxchg((void*)new_header, >>> rcvr->mark_addr(), mark) == mark) { >>> ? 737???????????? if (Atomic::cmpxchg((void*)new_header, >>> rcvr->mark_addr(), header) == header) { >>> >>> I'm not sure how the ones with casts even compile?? mark_addr() seems >>> to be a markOop*, which is a markOopDesc**, where markOopDesc is a >>> class.? void* is not implicitly convertible to markOopDesc*. >>> >>> Hm, this entire file is #ifdef CC_INTERP.? Is this zero-only code?? Or >>> something like that? >>> >>> Similarly here: >>> ? 906?????????? 
if (Atomic::cmpxchg(header, lockee->mark_addr(), >>> mark) == mark) { >>> and >>> ? 917?????????? if (Atomic::cmpxchg((void*)new_header, >>> lockee->mark_addr(), mark) == mark) { >>> ? 935?????????? if (Atomic::cmpxchg((void*)new_header, >>> lockee->mark_addr(), header) == header) { >>> >>> and here: >>> 1847?????????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >>> mark) == mark) { >>> 1858?????????????? if (Atomic::cmpxchg((void*)new_header, >>> lockee->mark_addr(), mark) == mark) { >>> 1878?????????????? if (Atomic::cmpxchg((void*)new_header, >>> lockee->mark_addr(), header) == header) { >>> >>> and here: >>> 1847?????????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >>> mark) == mark) { >>> 1858?????????????? if (Atomic::cmpxchg((void*)new_header, >>> lockee->mark_addr(), mark) == mark) { >>> 1878?????????????? if (Atomic::cmpxchg((void*)new_header, >>> lockee->mark_addr(), header) == header) { >> >> I've changed all these.?? This is part of Zero. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/memory/metaspace.cpp >>> 1502?? size_t value = OrderAccess::load_acquire(&_capacity_until_GC); >>> ... >>> 1537?? return (size_t)Atomic::sub((intptr_t)v, &_capacity_until_GC); >>> >>> These and other uses of _capacity_until_GC suggest that variable's >>> type should be size_t rather than intptr_t.? Note that I haven't done >>> a careful check of uses to see if there are any places where such a >>> change would cause problems. >> >> Yes, I had a hard time with metaspace.cpp because I agree >> _capacity_until_GC should be size_t.?? Tried to make this change and >> it cascaded a bit.? I'll file an RFE to change this type separately. >> >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/oops/constantPool.cpp >>> ? 229?? OrderAccess::release_store((Klass* volatile *)adr, k); >>> ? 246?? OrderAccess::release_store((Klass* volatile *)adr, k); >>> ? 514?? OrderAccess::release_store((Klass* volatile *)adr, k); >>> >>> Casts are not needed. >> >> fixed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/oops/constantPool.hpp >>> ? 148???? volatile intptr_t adr = >>> OrderAccess::load_acquire(obj_at_addr_raw(which)); >>> >>> [pre-existing] >>> Why is adr declared volatile? >> >> golly beats me.? concurrency is scary, especially in the constant pool. >> The load_acquire() should make sure the value is fetched from memory >> so volatile is unneeded. >> >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/oops/cpCache.cpp >>> ? 157???? intx newflags = (value & parameter_size_mask); >>> ? 158???? Atomic::cmpxchg(newflags, &_flags, (intx)0); >>> >>> This is a nice demonstration of why I wanted to include some value >>> preserving integral conversions in cmpxchg, rather than requiring >>> exact type matching in the integral case.? There have been some others >>> that I haven't commented on.? Apparently we (I) got away with >>> including such conversions in Atomic::add, which I'd forgotten about. >>> And see comment regarding Atomic::sub below. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/oops/cpCache.hpp >>> ? 139?? volatile Metadata*?? _f1;?????? // entry specific metadata field >>> >>> [pre-existing] >>> I suspect the type should be Metadata* volatile.? 
And that would >>> eliminate the need for the cast here: >>> >>> ? 339?? Metadata* f1_ord() const?????????????????????? { return >>> (Metadata *)OrderAccess::load_acquire(&_f1); } >>> >>> I don't know if there are any other changes needed or desirable around >>> _f1 usage. >> >> yes, fixed this. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/oops/method.hpp >>> ? 139?? volatile address from_compiled_entry() const?? { return >>> OrderAccess::load_acquire(&_from_compiled_entry); } >>> ? 140?? volatile address from_compiled_entry_no_trampoline() const; >>> ? 141?? volatile address from_interpreted_entry() const{ return >>> OrderAccess::load_acquire(&_from_interpreted_entry); } >>> >>> [pre-existing] >>> The volatile qualifiers here seem suspect to me. >> >> Again much suspicion about concurrency and giant pain, which I >> remember, of debugging these when they were broken. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/oops/oop.inline.hpp >>> ? 391???? narrowOop old = (narrowOop)Atomic::xchg(val, >>> (narrowOop*)dest); >>> >>> Cast of return type is not needed. >> >> fixed. >> >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/prims/jni.cpp >>> >>> [pre-existing] >>> >>> copy_jni_function_table should be using Copy::disjoint_words_atomic. >> >> yuck. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/prims/jni.cpp >>> >>> [pre-existing] >>> >>> 3892?? // We're about to use Atomic::xchg for synchronization. Some Zero >>> 3893?? // platforms use the GCC builtin __sync_lock_test_and_set for >>> this, >>> 3894?? // but __sync_lock_test_and_set is not guaranteed to do what >>> we want >>> 3895?? // on all architectures.? So we check it works before relying >>> on it. >>> 3896 #if defined(ZERO) && defined(ASSERT) >>> 3897?? { >>> 3898???? jint a = 0xcafebabe; >>> 3899???? jint b = Atomic::xchg(0xdeadbeef, &a); >>> 3900???? void *c = &a; >>> 3901???? void *d = Atomic::xchg(&b, &c); >>> 3902???? assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, >>> "Atomic::xchg() works"); >>> 3903???? assert(c == &b && d == &a, "Atomic::xchg() works"); >>> 3904?? } >>> 3905 #endif // ZERO && ASSERT >>> >>> It seems rather strange to be testing Atomic::xchg() here, rather than >>> as part of unit testing Atomic?? Fail unit testing => don't try to >>> use... >> >> This is zero.? I'm not touching this. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/prims/jvmtiRawMonitor.cpp >>> ? 130???? if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >>> ? 142???? if (_owner == NULL && Atomic::cmpxchg_if_null((void*)Self, >>> &_owner)) { >>> >>> I think these casts aren't needed. _owner is void*, and Self is >>> Thread*, which is implicitly convertible to void*. >>> >>> Similarly here, for the THREAD argument: >>> ? 280???? Contended = Atomic::cmpxchg((void*)THREAD, &_owner, >>> (void*)NULL); >>> ? 283???? Contended = Atomic::cmpxchg((void*)THREAD, &_owner, >>> (void*)NULL); >> >> Okay, let me see if the compiler(s) eat that. (yes they do) >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/prims/jvmtiRawMonitor.hpp >>> >>> This file is in the webrev, but seems to be unchanged. 
>> >> It'll be cleaned up with the the commit and not be part of the changeset. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/atomic.hpp >>> ? 520 template >>> ? 521 inline D Atomic::sub(I sub_value, D volatile* dest) { >>> ? 522?? STATIC_ASSERT(IsPointer::value || IsIntegral::value); >>> ? 523?? // Assumes two's complement integer representation. >>> ? 524?? #pragma warning(suppress: 4146) >>> ? 525?? return Atomic::add(-sub_value, dest); >>> ? 526 } >>> >>> I'm pretty sure this implementation is incorrect.? I think it produces >>> the wrong result when I and D are both unsigned integer types and >>> sizeof(I) < sizeof(D). >> >> Can you suggest a correction?? I just copied Atomic::dec(). >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/mutex.cpp >>> ? 304?? intptr_t v = Atomic::cmpxchg((intptr_t)_LBIT, >>> &_LockWord.FullWord, (intptr_t)0);? // agro ... >>> >>> _LBIT should probably be intptr_t, rather than an enum.? Note that the >>> enum type is unused.? The old value here is another place where an >>> implicit widening of same signedness would have been nice. (Such >>> implicit widening doesn't work for enums, since it's unspecified >>> whether they default to signed or unsigned representation, and >>> implementatinos differ.) >> >> This would be a good/simple cleanup.? I changed it to const intptr_t >> _LBIT = 1; >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/mutex.hpp >>> >>> [pre-existing] >>> >>> I think the Address member of the SplitWord union is unused. Looking >>> at AcquireOrPush (and others), I'm wondering whether it *should* be >>> used there, or whether just using intptr_t casts and doing integral >>> arithmetic (as is presently being done) is easier and clearer. >>> >>> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp >>> rather than polluting the global namespace.? And technically, that >>> name is reserved word. >> >> I moved both this and _LBIT into the top of mutex.cpp since they are >> used there. >> Cant define const intptr_t _LBIT =1; in a class in our version of C++. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/objectMonitor.cpp >>> ? 252?? void * cur = Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL); >>> ? 409?? if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >>> 1983?????? ox = (Thread*)Atomic::cmpxchg((void*)Self, &_owner, >>> (void*)NULL); >>> >>> I think the casts of Self aren't needed. >> >> fixed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/objectMonitor.cpp >>> ? 995?????? if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { >>> 1020???????? if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { >>> >>> I think the casts of THREAD aren't needed. >> >> nope, fixed. >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/objectMonitor.hpp >>> ? 254?? markOopDesc* volatile* header_addr(); >>> >>> Why isn't this volatile markOop* ? >> >> fixed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/synchronizer.cpp >>> ? 242???????? Atomic::cmpxchg_if_null((void*)Self, &(m->_owner))) { >>> >>> I think the cast of Self isn't needed. 
>> >> fixed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/synchronizer.cpp >>> ? 992?? for (; block != NULL; block = (PaddedEnd >>> *)next(block)) { >>> 1734???? for (; block != NULL; block = (PaddedEnd >>> *)next(block)) { >>> >>> [pre-existing] >>> All calls to next() pass a PaddedEnd* and cast the >>> result.? How about moving all that behavior into next(). >> >> I fixed this next() function, but it necessitated a cast to FreeNext >> field.? The PaddedEnd<> type was intentionally not propagated to all >> the things that use it.?? Which is a shame because there are a lot >> more casts to PaddedEnd that could have been removed. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/synchronizer.cpp >>> 1970???? if (monitor > (ObjectMonitor *)&block[0] && >>> 1971???????? monitor < (ObjectMonitor *)&block[_BLOCKSIZE]) { >>> >>> [pre-existing] >>> Are the casts needed here?? I think PaddedEnd is >>> derived from ObjectMonitor, so implicit conversions should apply. >> >> prob not.? removed them. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/synchronizer.hpp >>> ?? 28 #include "memory/padded.hpp" >>> ? 163?? static PaddedEnd * volatile gBlockList; >>> >>> I was going to suggest as an alternative just making gBlockList a file >>> scoped variable in synchronizer.cpp, since it isn't used outside of >>> that file. Except that it is referenced by vmStructs.? Curses! >> >> It's also used by the SA. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/runtime/thread.cpp >>> 4707?? intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, >>> (intptr_t)0); >>> >>> This and other places suggest LOCKBIT should be defined as intptr_t, >>> rather than as an enum value.? The MuxBits enum type is unused. >>> >>> And the cast of 0 is another case where implicit widening would be nice. >> >> Making LOCKBIT a const intptr_t = 1 removes a lot of casts. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/services/mallocSiteTable.cpp >>> ? 261 bool MallocSiteHashtableEntry::atomic_insert(const >>> MallocSiteHashtableEntry* entry) { >>> ? 262?? return Atomic::cmpxchg_if_null(entry, (const >>> MallocSiteHashtableEntry**)&_next); >>> ? 263 } >>> >>> I think the problem here that is leading to the cast is that >>> atomic_insert is taking a const T*.? Note that it's only caller passes >>> a non-const T*. >> >> I'll change the type to non-const.? We try to use consts... >> >> Thanks for the detailed review!? The gcc compiler seems happy so far, >> I'll post a webrev of the result of these changes after fixing >> Atomic::sub() and seeing how the other compilers deal with these changes. >> >> Thanks, >> Coleen >> >>> >>> ------------------------------------------------------------------------------ >>> >>> >> > From rkennke at redhat.com Sat Oct 14 22:41:05 2017 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 15 Oct 2017 00:41:05 +0200 Subject: RFR: 8171853: Remove Shark compiler Message-ID: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> The JEP to remove the Shark compiler has received exclusively positive feedback (JDK-8189173) on zero-dev. So here comes the big patch to remove it. 
What I have done: grep -i -R shark src grep -i -R shark make grep -i -R shark doc grep -i -R shark doc and purged any reference to shark. Almost everything was straightforward. The only things I wasn't really sure of: - in globals.hpp, I re-arranged the KIND_* bits to account for the gap that removing KIND_SHARK left. I hope that's good? - in relocInfo_zero.hpp I put a ShouldNotCallThis() in pd_address_in_code(), I am not sure it is the right thing to do. If not, what *would* be the right thing? Then of course I did: rm -rf src/hotspot/share/shark I also went through the build machinery and removed stuff related to Shark and LLVM libs. Now the only references in the whole JDK tree to shark is a 'Shark Bay' in a timezone file, and 'Wireshark' in some tests ;-) I tested by building a regular x86 JVM and running JTREG tests. All looks fine. - I could not build zero because it seems broken because of the recent Atomic::* changes - I could not test any of the other arches that seemed to reference Shark (arm and sparc) Here's the full webrev: http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ Can I get a review on this? Thanks, Roman From kim.barrett at oracle.com Sat Oct 14 23:36:44 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Sat, 14 Oct 2017 19:36:44 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> Message-ID: <0784FA88-3D00-4DBA-8726-3A3B23C91B3E@oracle.com> > On Oct 13, 2017, at 2:34 PM, coleen.phillimore at oracle.com wrote: > > > Hi, Here is the version with the changes from Kim's comments that has passed at least testing with JPRT and tier1, locally. More testing (tier2-5) is in progress. > > Also includes a corrected version of Atomic::sub care of Erik Osterlund. > > open webrev at http://cr.openjdk.java.net/~coleenp/8188220.kim-review-changes/webrev > open webrev at http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev > > Full version: > > http://cr.openjdk.java.net/~coleenp/8188220.03/webrev > > Thanks! > Coleen I still dislike and disagree with what is being proposed regarding replace_if_null. ------------------------------------------------------------------------------ I forgot that I'd promised you an updated Atomic::sub definition. Unfortunately, the new one still has problems, performing some conversions that should not be permitted (and are disallowed by Atomic::add). Try this instead. (This hasn't been tested, not even compiled; hopefully I don't have any typos or anything.) The intent is that this supports the same conversions as Atomic::add.

template<typename I, typename D>
inline D Atomic::sub(I sub_value, D volatile* dest) {
  STATIC_ASSERT(IsPointer<D>::value || IsIntegral<D>::value);
  STATIC_ASSERT(IsIntegral<I>::value);
  // If D is a pointer type, use [u]intptr_t as the addend type,
  // matching signedness of I. Otherwise, use D as the addend type.
  typedef typename Conditional<IsSigned<I>::value, intptr_t, uintptr_t>::type PI;
  typedef typename Conditional<IsPointer<D>::value, PI, D>::type AddendType;
  // Only allow conversions that can't change the value.
  STATIC_ASSERT(IsSigned<I>::value == IsSigned<AddendType>::value);
  STATIC_ASSERT(sizeof(I) <= sizeof(AddendType));
  AddendType addend = sub_value;
  // Assumes two's complement integer representation.
  #pragma warning(suppress: 4146) // In case AddendType is not signed.
return Atomic::add(-addend, dest); } >>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>> 7960 Atomic::add(-n, &_num_par_pushes); >>> >>> Atomic::sub >> >> fixed. Nope, not fixed in http://cr.openjdk.java.net/~coleenp/8188220.03/webrev >>> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >>> 200 PerRegionTable* res = >>> 201 Atomic::cmpxchg(nxt, &_free_list, fl); >>> >>> Please remove the line break, now that the code has been simplified. >>> >>> But wait, doesn't this alloc exhibit classic ABA problems? I *think* >>> this works because alloc and bulk_free are called in different phases, >>> never overlapping. >> >> I don't know. Do you want to file a bug to investigate this? >> fixed. No, I now think it?s ok, though confusing. >>> src/hotspot/share/gc/g1/sparsePRT.cpp >>> 295 SparsePRT* res = >>> 296 Atomic::cmpxchg(sprt, &_head_expanded_list, hd); >>> and >>> 307 SparsePRT* res = >>> 308 Atomic::cmpxchg(next, &_head_expanded_list, hd); >>> >>> I'd rather not have the line breaks in these either. >>> >>> And get_from_expanded_list also appears to have classic ABA problems. >>> I *think* this works because add_to_expanded_list and >>> get_from_expanded_list are called in different phases, never >>> overlapping. >> >> Fixed, same question as above? Or one bug to investigate both? Again, I think it?s ok, though confusing. >>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>> 262 return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>> 263 (volatile intptr_t *)&_data, >>> 264 (intptr_t)old_age._data); >>> >>> This should be >>> >>> return Atomic::cmpxchg(new_age._data, &_data, old_age._data); >> >> fixed. Still casting the result. >>> src/hotspot/share/oops/method.hpp >>> 139 volatile address from_compiled_entry() const { return OrderAccess::load_acquire(&_from_compiled_entry); } >>> 140 volatile address from_compiled_entry_no_trampoline() const; >>> 141 volatile address from_interpreted_entry() const{ return OrderAccess::load_acquire(&_from_interpreted_entry); } >>> >>> [pre-existing] >>> The volatile qualifiers here seem suspect to me. >> >> Again much suspicion about concurrency and giant pain, which I remember, of debugging these when they were broken. Let me be more direct: the volatile qualifiers for the function return types are bogus and confusing, and should be removed. >>> src/hotspot/share/prims/jni.cpp >>> >>> [pre-existing] >>> >>> copy_jni_function_table should be using Copy::disjoint_words_atomic. >> >> yuck. Of course, neither is entirely technically correct, since both are treating conversion of function pointers to void* as okay in shared code, e.g. violating some of the raison d'etre of CAST_{TO,FROM}_FN_PTR. For way more detail than you probably care about, see the discussion starting here: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-March/018578.html through (5 messages in total) http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-March/018623.html Oh well. >>> src/hotspot/share/runtime/mutex.hpp >>> >>> [pre-existing] >>> >>> I think the Address member of the SplitWord union is unused. Looking >>> at AcquireOrPush (and others), I'm wondering whether it *should* be >>> used there, or whether just using intptr_t casts and doing integral >>> arithmetic (as is presently being done) is easier and clearer. >>> >>> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp >>> rather than polluting the global namespace. And technically, that >>> name is reserved word. 
>> >> I moved both this and _LBIT into the top of mutex.cpp since they are used there. Good. >> Cant define const intptr_t _LBIT =1; in a class in our version of C++. Sorry, please explain? If you tried to move it into SplitWord, that doesn?t work; unions are not permitted to have static data members (I don?t off-hand know why, just that it?s explicitly forbidden). And you left the seemingly unused Address member in SplitWord. >>> src/hotspot/share/runtime/thread.cpp >>> 4707 intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, (intptr_t)0); >>> >>> This and other places suggest LOCKBIT should be defined as intptr_t, >>> rather than as an enum value. The MuxBits enum type is unused. >>> >>> And the cast of 0 is another case where implicit widening would be nice. >> >> Making LOCKBIT a const intptr_t = 1 removes a lot of casts. Because of the new definition of LOCKBIT I noticed the immediately preceeding typedef for MutexT, which seems to be unused. ------------------------------------------------------------------------------ src/hotspot/share/oops/cpCache.cpp 114 bool ConstantPoolCacheEntry::init_flags_atomic(intx flags) { 115 intptr_t result = Atomic::cmpxchg(flags, &_flags, (intx)0); 116 return (result == 0); 117 } [I missed this on earlier pass.] Should be bool ConstantPoolCacheEntry::init_flags_atomic(intx flags) { return Atomic::cmpxchg(flags, &_flags, (intx)0) == 0; } Otherwise, I end up asking why result is intptr_t when the cmpxchg is dealing with intx. Yeah, one's a typedef of the other, but mixing them like that in the same expression is not helpful. From glaubitz at physik.fu-berlin.de Sun Oct 15 06:06:12 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Sun, 15 Oct 2017 08:06:12 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> Message-ID: <87BC5241-9C27-457F-9856-3D969831DABC@physik.fu-berlin.de> Hi Roman! Please let me look at SPARC next week first before merging this. And thanks for notifying me that Zero is broken again *sigh*. People, please test your changes. Yes, I know you all just care about Hotspot. But please understand that there are many people out there who rely on Zero, i.e. they are using it. Breaking code that people actively use is not nice and should not happen in a project like OpenJDK. Building Zero takes maybe 5 minutes on a fast x86 machine, so I would like to ask everyone to please test their changes against Zero as well. These tests will keep the headaches for people relying on Zero low and also avoids that distributions have to ship many patches on top of OpenJDK upstream. If you cannot test your patch on a given platform X, please let me know. I have access to every platform supported by OpenJDK except AIX/PPC. Thanks, Adrian > On Oct 15, 2017, at 12:41 AM, Roman Kennke wrote: > > The JEP to remove the Shark compiler has received exclusively positive feedback (JDK-8189173) on zero-dev. So here comes the big patch to remove it. > > What I have done: > > grep -i -R shark src > grep -i -R shark make > grep -i -R shark doc > grep -i -R shark doc > > and purged any reference to shark. Almost everything was straightforward. > > The only things I wasn't really sure of: > > - in globals.hpp, I re-arranged the KIND_* bits to account for the gap that removing KIND_SHARK left. I hope that's good? 
> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in pd_address_in_code(), I am not sure it is the right thing to do. If not, what *would* be the right thing? > > Then of course I did: > > rm -rf src/hotspot/share/shark > > I also went through the build machinery and removed stuff related to Shark and LLVM libs. > > Now the only references in the whole JDK tree to shark is a 'Shark Bay' in a timezone file, and 'Wireshark' in some tests ;-) > > I tested by building a regular x86 JVM and running JTREG tests. All looks fine. > > - I could not build zero because it seems broken because of the recent Atomic::* changes > - I could not test any of the other arches that seemed to reference Shark (arm and sparc) > > Here's the full webrev: > > http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ > > Can I get a review on this? > > Thanks, Roman From rkennke at redhat.com Sun Oct 15 20:20:17 2017 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 15 Oct 2017 22:20:17 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <87BC5241-9C27-457F-9856-3D969831DABC@physik.fu-berlin.de> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <87BC5241-9C27-457F-9856-3D969831DABC@physik.fu-berlin.de> Message-ID: Hi Adrian, > Please let me look at SPARC next week first before merging this. Thanks! Will wait for your feedback! > And thanks for notifying me that Zero is broken again *sigh*. It seems to be only a little thing. I have a fix that I'm currently testing. Will file another bug and an RFR soon. Thanks, Roman From glaubitz at physik.fu-berlin.de Sun Oct 15 20:26:58 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Sun, 15 Oct 2017 22:26:58 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> Message-ID: Hi Roman! On 10/15/2017 12:41 AM, Roman Kennke wrote: > The JEP to remove the Shark compiler has received exclusively positive > feedback (JDK-8189173) on zero-dev. So here comes the big patch to remove it. I have now read through the JEP and I have to say, I'm sad to see Shark go. In my opinion, Shark should be a supported version of the JVM as LLVM is gaining code generation support for more and more architectures. I have always liked the idea to split out the code generation of compilers into a separate project and, in fact, the compilers for many other languages like Rust and Julia rely on LLVM. It's a pity that this value is not seen within the OpenJDK project. > I tested by building a regular x86 JVM and running JTREG tests. All looks fine. > > - I could not build zero because it seems broken because of the recent Atomic::* changes I just performed a Zero test build with the current HG revision of OpenJDK on x86_64 without any problems and Zero on SPARC builds fine as well, so the problem you are seeing has apparently been fixed now. I have not tested your patch yet though, I just wanted to verify whether Zero still builds fine. > - I could not test any of the other arches that seemed to reference Shark (arm and sparc) I will test this later. I am currently waiting for JDK-8186579 to get merged which fixes the last problem on Linux-SPARC. Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. 
`' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From rkennke at redhat.com Sun Oct 15 20:34:46 2017 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 15 Oct 2017 22:34:46 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> Message-ID: <152a7a54-d30f-3c82-313a-608ef118628a@redhat.com> Am 15.10.2017 um 22:26 schrieb John Paul Adrian Glaubitz: > Hi Roman! > > On 10/15/2017 12:41 AM, Roman Kennke wrote: >> The JEP to remove the Shark compiler has received exclusively positive >> feedback (JDK-8189173) on zero-dev. So here comes the big patch to >> remove it. > > I have now read through the JEP and I have to say, I'm sad to see > Shark go. > > In my opinion, Shark should be a supported version of the JVM as LLVM > is gaining > code generation support for more and more architectures. I have always > liked the > idea to split out the code generation of compilers into a separate > project and, > in fact, the compilers for many other languages like Rust and Julia > rely on LLVM. > > It's a pity that this value is not seen within the OpenJDK project. Yes, I agree with you. However, at this point, fixing Shark amounts to almost complete rewrite of it. It would nowadays be based on jvmci. It would use the new and presumably much better JIT interface of LLVM. It would not use a shadow stack and a sane interface between LLVM and the GC (which hasn't existed back then). It's a project I'd personally like to do just for the fun of it, but I simply don't have enough time and the nerve to pull it off alone. In any case, as I said, it would probably make sense to start it from scratch. >> I tested by building a regular x86 JVM and running JTREG tests. All >> looks fine. >> >> - I could not build zero because it seems broken because of the >> recent Atomic::* changes > > I just performed a Zero test build with the current HG revision of > OpenJDK on x86_64 > without any problems and Zero on SPARC builds fine as well, so the > problem you are > seeing has apparently been fixed now. I have not tested your patch yet > though, I just > wanted to verify whether Zero still builds fine. I checked and noticed that it only affects debug builds. That's probably why it slipped through. I filed https://bugs.openjdk.java.net/browse/JDK-8189333 and will post an RFR later. >> - I could not test any of the other arches that seemed to reference >> Shark (arm and sparc) > > I will test this later. I am currently waiting for JDK-8186579 to get > merged which fixes > the last problem on Linux-SPARC. Okidoki, thanks a lot!! Cheers, Roman From glaubitz at physik.fu-berlin.de Sun Oct 15 20:44:16 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Sun, 15 Oct 2017 22:44:16 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <152a7a54-d30f-3c82-313a-608ef118628a@redhat.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <152a7a54-d30f-3c82-313a-608ef118628a@redhat.com> Message-ID: On 10/15/2017 10:34 PM, Roman Kennke wrote: >> It's a pity that this value is not seen within the OpenJDK project. > > Yes, I agree with you. However, at this point, fixing Shark amounts to almost > complete rewrite of it. It would nowadays be based on jvmci. It would use the > new and presumably much better JIT interface of LLVM. It would not use a shadow > stack and a sane interface between LLVM and the GC (which hasn't existed back then). 
Ok, that gives me some consolation, although I'm still sad about this decision. > It's a project I'd personally like to do just for the fun of it, but I simply don't > have enough time and the nerve to pull it off alone. In any case, as I said, it > would probably make sense to start it from scratch. FWIW, there are actually quite a number of users for Zero who would be happy to have a JIT-version of it. One major user for Zero is MIPS (big-, little-endian, 32 and 64 bit) which still doesn't have a native code generator in Hotspot. But we're also using Zero on architectures like m68k (yes, that still exists as people are upgrading their Amigas and Ataris with fast FPGA accelerators) and SuperH and it works fine. I have also contributed several patches already to get Zero into a better shape which allows it to build within Debian without additional patches, I would definitely be interested in helping with a new Shark JVM although I understand that would be a bigger project :). > I checked and noticed that it only affects debug builds. That's probably why it slipped through. > > I filed https://bugs.openjdk.java.net/browse/JDK-8189333 and will post an RFR later. Ok, I'll test it once you've posted it. >>> - I could not test any of the other arches that seemed to reference Shark (arm and sparc) >> >> I will test this later. I am currently waiting for JDK-8186579 to get merged which fixes >> the last problem on Linux-SPARC. > > Okidoki, thanks a lot!! Let me pull this in and test Zero and Server on Linux SPARC. Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. `' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From david.holmes at oracle.com Sun Oct 15 20:48:23 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 06:48:23 +1000 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> Message-ID: <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> Hi Roman, The build changes must be reviewed on build-dev - now cc'd. Thanks, David On 15/10/2017 8:41 AM, Roman Kennke wrote: > The JEP to remove the Shark compiler has received exclusively positive > feedback (JDK-8189173) on zero-dev. So here comes the big patch to > remove it. > > What I have done: > > grep -i -R shark src > grep -i -R shark make > grep -i -R shark doc > grep -i -R shark doc > > and purged any reference to shark. Almost everything was straightforward. > > The only things I wasn't really sure of: > > - in globals.hpp, I re-arranged the KIND_* bits to account for the gap > that removing KIND_SHARK left. I hope that's good? > - in relocInfo_zero.hpp I put a ShouldNotCallThis() in > pd_address_in_code(), I am not sure it is the right thing to do. If not, > what *would* be the right thing? > > Then of course I did: > > rm -rf src/hotspot/share/shark > > I also went through the build machinery and removed stuff related to > Shark and LLVM libs. > > Now the only references in the whole JDK tree to shark is a 'Shark Bay' > in a timezone file, and 'Wireshark' in some tests ;-) > > I tested by building a regular x86 JVM and running JTREG tests. All > looks fine. 
> > - I could not build zero because it seems broken because of the recent > Atomic::* changes > - I could not test any of the other arches that seemed to reference > Shark (arm and sparc) > > Here's the full webrev: > > http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ > > > Can I get a review on this? > > Thanks, Roman > From glaubitz at physik.fu-berlin.de Sun Oct 15 20:51:08 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Sun, 15 Oct 2017 22:51:08 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> Message-ID: <6e68e18a-f13a-bfd5-f486-d75448538ceb@physik.fu-berlin.de> On 10/15/2017 12:41 AM, Roman Kennke wrote: > Here's the full webrev: > > http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ Hmm, I just tried importing it: glaubitz at deb4g:~/openjdk/hs$ hg import http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/jdk10-hs-single.changeset applying http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/jdk10-hs-single.changeset patching file make/autoconf/generated-configure.sh Hunk #7 FAILED at 5104 1 out of 19 hunks FAILED -- saving rejects to file make/autoconf/generated-configure.sh.rej abort: patch failed to apply glaubitz at deb4g:~/openjdk/hs$ Does it need to be rebased? Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. `' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From rkennke at redhat.com Sun Oct 15 20:52:43 2017 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 15 Oct 2017 22:52:43 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <152a7a54-d30f-3c82-313a-608ef118628a@redhat.com> Message-ID: Am 15.10.2017 um 22:44 schrieb John Paul Adrian Glaubitz: > On 10/15/2017 10:34 PM, Roman Kennke wrote: >>> It's a pity that this value is not seen within the OpenJDK project. >> >> Yes, I agree with you. However, at this point, fixing Shark amounts >> to almost >> complete rewrite of it. It would nowadays be based on jvmci. It would >> use the >> new and presumably much better JIT interface of LLVM. It would not >> use a shadow >> stack and a sane interface between LLVM and the GC (which hasn't >> existed back then). > > Ok, that gives me some consolation, although I'm still sad about this > decision. > >> It's a project I'd personally like to do just for the fun of it, but >> I simply don't >> have enough time and the nerve to pull it off alone. In any case, as >> I said, it >> would probably make sense to start it from scratch. > > FWIW, there are actually quite a number of users for Zero who would be > happy to > have a JIT-version of it. One major user for Zero is MIPS (big-, > little-endian, > 32 and 64 bit) which still doesn't have a native code generator in > Hotspot. > > But we're also using Zero on architectures like m68k (yes, that still > exists > as people are upgrading their Amigas and Ataris with fast FPGA > accelerators) > and SuperH and it works fine. And here is another complication: the last time I checked, the LLVM JIT only support very few platforms. I don't remember from the top off my head, but I'm pretty sure it's a subset of those supported natively by hotspot now (x86, arm and probably ppc). I doubt that MIPS and m68k are on the list of LLVM JIT supported platforms. 
A quick search yields no current information about this though. > I have also contributed several patches already to get Zero into a better > shape which allows it to build within Debian without additional patches, > I would definitely be interested in helping with a new Shark JVM although > I understand that would be a bigger project :). Ok cool! If/when I ever get to do it (or somebody else) this will be very welcome :-) Cheers, Roman From rkennke at redhat.com Sun Oct 15 21:00:15 2017 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 15 Oct 2017 23:00:15 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <6e68e18a-f13a-bfd5-f486-d75448538ceb@physik.fu-berlin.de> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <6e68e18a-f13a-bfd5-f486-d75448538ceb@physik.fu-berlin.de> Message-ID: <67a4e380-64d3-d863-5b8f-53554158082f@redhat.com> Am 15.10.2017 um 22:51 schrieb John Paul Adrian Glaubitz: > On 10/15/2017 12:41 AM, Roman Kennke wrote: >> Here's the full webrev: >> >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >> > > Hmm, I just tried importing it: > > glaubitz at deb4g:~/openjdk/hs$ hg import > http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/jdk10-hs-single.changeset > applying > http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/jdk10-hs-single.changeset > patching file make/autoconf/generated-configure.sh > Hunk #7 FAILED at 5104 > 1 out of 19 hunks FAILED -- saving rejects to file > make/autoconf/generated-configure.sh.rej > abort: patch failed to apply > glaubitz at deb4g:~/openjdk/hs$ > > Does it need to be rebased? Shouldn't be the case, but just to be sure, my patch is based on: http://hg.openjdk.java.net/jdk10/hs/ Also, I've made a small fix that was related to Zero (now that I can actually build it), and I'm currently uploading to: http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ Notice that I have removed the generated-configure.sh part, which means you will be prompted to re-generate it. Roman From glaubitz at physik.fu-berlin.de Sun Oct 15 21:01:15 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Sun, 15 Oct 2017 23:01:15 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <67a4e380-64d3-d863-5b8f-53554158082f@redhat.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <6e68e18a-f13a-bfd5-f486-d75448538ceb@physik.fu-berlin.de> <67a4e380-64d3-d863-5b8f-53554158082f@redhat.com> Message-ID: <094e215b-150a-4859-427d-85a201f118e4@physik.fu-berlin.de> On 10/15/2017 11:00 PM, Roman Kennke wrote: >> Does it need to be rebased? > > Shouldn't be the case, but just to be sure, my patch is based on: > > http://hg.openjdk.java.net/jdk10/hs/ > > Also, I've made a small fix that was related to Zero (now that I can actually build it), and I'm currently uploading to: > > http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ > > Notice that I have removed the generated-configure.sh part, which means you will be prompted to re-generate it. Ok, I will pull that. Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. 
`' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From rkennke at redhat.com Sun Oct 15 21:01:42 2017 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 15 Oct 2017 23:01:42 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> Message-ID: <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> Hi David, thanks! I'm uploading a 2nd revision of the patch that excludes the generated-configure.sh part, and adds a smallish Zero-related fix. http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ Thanks, Roman > Hi Roman, > > The build changes must be reviewed on build-dev - now cc'd. > > Thanks, > David > > On 15/10/2017 8:41 AM, Roman Kennke wrote: >> The JEP to remove the Shark compiler has received exclusively >> positive feedback (JDK-8189173) on zero-dev. So here comes the big >> patch to remove it. >> >> What I have done: >> >> grep -i -R shark src >> grep -i -R shark make >> grep -i -R shark doc >> grep -i -R shark doc >> >> and purged any reference to shark. Almost everything was >> straightforward. >> >> The only things I wasn't really sure of: >> >> - in globals.hpp, I re-arranged the KIND_* bits to account for the >> gap that removing KIND_SHARK left. I hope that's good? >> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >> pd_address_in_code(), I am not sure it is the right thing to do. If >> not, what *would* be the right thing? >> >> Then of course I did: >> >> rm -rf src/hotspot/share/shark >> >> I also went through the build machinery and removed stuff related to >> Shark and LLVM libs. >> >> Now the only references in the whole JDK tree to shark is a 'Shark >> Bay' in a timezone file, and 'Wireshark' in some tests ;-) >> >> I tested by building a regular x86 JVM and running JTREG tests. All >> looks fine. >> >> - I could not build zero because it seems broken because of the >> recent Atomic::* changes >> - I could not test any of the other arches that seemed to reference >> Shark (arm and sparc) >> >> Here's the full webrev: >> >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >> >> >> Can I get a review on this? >> >> Thanks, Roman >> From rkennke at redhat.com Sun Oct 15 21:12:23 2017 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 15 Oct 2017 23:12:23 +0200 Subject: RFR: 8189333: Fix Zero build after Atomic::xchg changes Message-ID: <003ff7d9-759f-1ef5-f580-18c2571b63e5@redhat.com> Zero debug build has been broken by: JDK-8187977: Generalize Atomic::xchg to use templates. This patch fixes it by casting the unsigned literal to jint: http://cr.openjdk.java.net/~rkennke/8189333/webrev.00/ Tested by building zero fastdebug and running some small test programs. Ok? Roman From david.holmes at oracle.com Sun Oct 15 21:23:52 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 07:23:52 +1000 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> Message-ID: <9d0b0656-c168-7e72-e272-893d0b475d56@oracle.com> Hi Roman, I've looked at all the changes for the build and hotspot and everything appears okay to me. Still need someone from compiler team and build team to sign off on this though. 
One observation in src/hotspot/cpu/zero/sharedRuntime_zero.cpp, these includes would seem to be impossible: 38 #ifdef COMPILER1 39 #include "c1/c1_Runtime1.hpp" 40 #endif 41 #ifdef COMPILER2 42 #include "opto/runtime.hpp" 43 #endif no? In src/hotspot/share/ci/ciEnv.cpp you can just delete the comment entirely as it's obviously C2: if (is_c2_compile(comp_level)) { // C2 Ditto in src/hotspot/share/compiler/compileBroker.cpp ! // C2 make_thread(name_buffer, _c2_compile_queue, counters, _compilers[1], compiler_thread, CHECK); Thanks, David ----- On 16/10/2017 6:48 AM, David Holmes wrote: > Hi Roman, > > The build changes must be reviewed on build-dev - now cc'd. > > Thanks, > David > > On 15/10/2017 8:41 AM, Roman Kennke wrote: >> The JEP to remove the Shark compiler has received exclusively positive >> feedback (JDK-8189173) on zero-dev. So here comes the big patch to >> remove it. >> >> What I have done: >> >> grep -i -R shark src >> grep -i -R shark make >> grep -i -R shark doc >> grep -i -R shark doc >> >> and purged any reference to shark. Almost everything was straightforward. >> >> The only things I wasn't really sure of: >> >> - in globals.hpp, I re-arranged the KIND_* bits to account for the gap >> that removing KIND_SHARK left. I hope that's good? >> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >> pd_address_in_code(), I am not sure it is the right thing to do. If >> not, what *would* be the right thing? >> >> Then of course I did: >> >> rm -rf src/hotspot/share/shark >> >> I also went through the build machinery and removed stuff related to >> Shark and LLVM libs. >> >> Now the only references in the whole JDK tree to shark is a 'Shark >> Bay' in a timezone file, and 'Wireshark' in some tests ;-) >> >> I tested by building a regular x86 JVM and running JTREG tests. All >> looks fine. >> >> - I could not build zero because it seems broken because of the recent >> Atomic::* changes >> - I could not test any of the other arches that seemed to reference >> Shark (arm and sparc) >> >> Here's the full webrev: >> >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >> >> >> Can I get a review on this? >> >> Thanks, Roman >> From david.holmes at oracle.com Sun Oct 15 21:25:04 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 07:25:04 +1000 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> Message-ID: On 16/10/2017 7:01 AM, Roman Kennke wrote: > Hi David, > > thanks! > > I'm uploading a 2nd revision of the patch that excludes the > generated-configure.sh part, and adds a smallish Zero-related fix. > > http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ > Can you point me to the exact change please as I don't want to re-examine it all. :) I'll pull this in and do a test build run internally. Thanks, David > Thanks, Roman > > >> Hi Roman, >> >> The build changes must be reviewed on build-dev - now cc'd. >> >> Thanks, >> David >> >> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>> The JEP to remove the Shark compiler has received exclusively >>> positive feedback (JDK-8189173) on zero-dev. So here comes the big >>> patch to remove it. >>> >>> What I have done: >>> >>> grep -i -R shark src >>> grep -i -R shark make >>> grep -i -R shark doc >>> grep -i -R shark doc >>> >>> and purged any reference to shark. 
Almost everything was >>> straightforward. >>> >>> The only things I wasn't really sure of: >>> >>> - in globals.hpp, I re-arranged the KIND_* bits to account for the >>> gap that removing KIND_SHARK left. I hope that's good? >>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>> pd_address_in_code(), I am not sure it is the right thing to do. If >>> not, what *would* be the right thing? >>> >>> Then of course I did: >>> >>> rm -rf src/hotspot/share/shark >>> >>> I also went through the build machinery and removed stuff related to >>> Shark and LLVM libs. >>> >>> Now the only references in the whole JDK tree to shark is a 'Shark >>> Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>> >>> I tested by building a regular x86 JVM and running JTREG tests. All >>> looks fine. >>> >>> - I could not build zero because it seems broken because of the >>> recent Atomic::* changes >>> - I could not test any of the other arches that seemed to reference >>> Shark (arm and sparc) >>> >>> Here's the full webrev: >>> >>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>> >>> >>> Can I get a review on this? >>> >>> Thanks, Roman >>> > From david.holmes at oracle.com Sun Oct 15 21:29:33 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 07:29:33 +1000 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> Message-ID: Just spotted this: ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** {@code CompLevel::CompLevel_full_optimization} -- C2 or Shark */ David On 16/10/2017 7:25 AM, David Holmes wrote: > On 16/10/2017 7:01 AM, Roman Kennke wrote: >> Hi David, >> >> thanks! >> >> I'm uploading a 2nd revision of the patch that excludes the >> generated-configure.sh part, and adds a smallish Zero-related fix. >> >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >> > > Can you point me to the exact change please as I don't want to > re-examine it all. :) > > I'll pull this in and do a test build run internally. > > Thanks, > David > >> Thanks, Roman >> >> >>> Hi Roman, >>> >>> The build changes must be reviewed on build-dev - now cc'd. >>> >>> Thanks, >>> David >>> >>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>> The JEP to remove the Shark compiler has received exclusively >>>> positive feedback (JDK-8189173) on zero-dev. So here comes the big >>>> patch to remove it. >>>> >>>> What I have done: >>>> >>>> grep -i -R shark src >>>> grep -i -R shark make >>>> grep -i -R shark doc >>>> grep -i -R shark doc >>>> >>>> and purged any reference to shark. Almost everything was >>>> straightforward. >>>> >>>> The only things I wasn't really sure of: >>>> >>>> - in globals.hpp, I re-arranged the KIND_* bits to account for the >>>> gap that removing KIND_SHARK left. I hope that's good? >>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>>> pd_address_in_code(), I am not sure it is the right thing to do. If >>>> not, what *would* be the right thing? >>>> >>>> Then of course I did: >>>> >>>> rm -rf src/hotspot/share/shark >>>> >>>> I also went through the build machinery and removed stuff related to >>>> Shark and LLVM libs. >>>> >>>> Now the only references in the whole JDK tree to shark is a 'Shark >>>> Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>>> >>>> I tested by building a regular x86 JVM and running JTREG tests. All >>>> looks fine. 
>>>> >>>> - I could not build zero because it seems broken because of the >>>> recent Atomic::* changes >>>> - I could not test any of the other arches that seemed to reference >>>> Shark (arm and sparc) >>>> >>>> Here's the full webrev: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>> >>>> >>>> Can I get a review on this? >>>> >>>> Thanks, Roman >>>> >> From rkennke at redhat.com Sun Oct 15 21:31:51 2017 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 15 Oct 2017 23:31:51 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <9d0b0656-c168-7e72-e272-893d0b475d56@oracle.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <9d0b0656-c168-7e72-e272-893d0b475d56@oracle.com> Message-ID: <0c0f0e20-86d5-bd3e-06ec-5b4c103eb3e7@redhat.com> Hi David, thanks for reviewing! > > One observation in src/hotspot/cpu/zero/sharedRuntime_zero.cpp, these > includes would seem to be impossible: > > ? 38 #ifdef COMPILER1 > ? 39 #include "c1/c1_Runtime1.hpp" > ? 40 #endif > ? 41 #ifdef COMPILER2 > ? 42 #include "opto/runtime.hpp" > ? 43 #endif > > no? I have no idea. It is at least theoretically possible to have a platform with C1 and/or C2 support based on the Zero interpreter? I'm leaving that in for now as it was pre-existing and not related to Shark removal, ok? > > In src/hotspot/share/ci/ciEnv.cpp you can just delete the comment > entirely as it's obviously C2: > > if (is_c2_compile(comp_level)) { // C2 > > Ditto in src/hotspot/share/compiler/compileBroker.cpp > > !???? // C2 > ????? make_thread(name_buffer, _c2_compile_queue, counters, > _compilers[1], compiler_thread, CHECK); Ok, right. For consistency, I also remove // C1 in ciEnv.cpp similarily obvious is_c1_compile() call :-) New webrev: http://cr.openjdk.java.net/~rkennke/8171853/webrev.02/ Roman From david.holmes at oracle.com Sun Oct 15 21:32:26 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 07:32:26 +1000 Subject: RFR: 8189333: Fix Zero build after Atomic::xchg changes In-Reply-To: <003ff7d9-759f-1ef5-f580-18c2571b63e5@redhat.com> References: <003ff7d9-759f-1ef5-f580-18c2571b63e5@redhat.com> Message-ID: Hi Roman, On 16/10/2017 7:12 AM, Roman Kennke wrote: > Zero debug build has been broken by: JDK-8187977: Generalize > Atomic::xchg to use templates. > > This patch fixes it by casting the unsigned literal to jint: > > http://cr.openjdk.java.net/~rkennke/8189333/webrev.00/ > Looks fine. I can push this for you straight away (relatively speaking :) ) under the trivial rule. Thanks, David > Tested by building zero fastdebug and running some small test programs. > > Ok? > > > Roman > From david.holmes at oracle.com Sun Oct 15 21:33:44 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 07:33:44 +1000 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <0c0f0e20-86d5-bd3e-06ec-5b4c103eb3e7@redhat.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <9d0b0656-c168-7e72-e272-893d0b475d56@oracle.com> <0c0f0e20-86d5-bd3e-06ec-5b4c103eb3e7@redhat.com> Message-ID: <86c02492-ecf5-197b-7ca1-a411f68000c5@oracle.com> On 16/10/2017 7:31 AM, Roman Kennke wrote: > Hi David, > > thanks for reviewing! > >> >> One observation in src/hotspot/cpu/zero/sharedRuntime_zero.cpp, these >> includes would seem to be impossible: >> >> ? 38 #ifdef COMPILER1 >> ? 39 #include "c1/c1_Runtime1.hpp" >> ? 40 #endif >> ? 41 #ifdef COMPILER2 >> ? 
42 #include "opto/runtime.hpp" >> ? 43 #endif >> >> no? > > I have no idea. It is at least theoretically possible to have a platform > with C1 and/or C2 support based on the Zero interpreter? I'm leaving > that in for now as it was pre-existing and not related to Shark removal, > ok? Yep that's fine. Thanks. David >> >> In src/hotspot/share/ci/ciEnv.cpp you can just delete the comment >> entirely as it's obviously C2: >> >> if (is_c2_compile(comp_level)) { // C2 >> >> Ditto in src/hotspot/share/compiler/compileBroker.cpp >> >> !???? // C2 >> ????? make_thread(name_buffer, _c2_compile_queue, counters, >> _compilers[1], compiler_thread, CHECK); > > Ok, right. For consistency, I also remove // C1 in ciEnv.cpp similarily > obvious is_c1_compile() call :-) > > New webrev: > > http://cr.openjdk.java.net/~rkennke/8171853/webrev.02/ > > > Roman From rkennke at redhat.com Sun Oct 15 21:39:54 2017 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 15 Oct 2017 23:39:54 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> Message-ID: <7deb690b-8b74-1ca2-948e-d76d0d133814@redhat.com> Am 15.10.2017 um 23:25 schrieb David Holmes: > On 16/10/2017 7:01 AM, Roman Kennke wrote: >> Hi David, >> >> thanks! >> >> I'm uploading a 2nd revision of the patch that excludes the >> generated-configure.sh part, and adds a smallish Zero-related fix. >> >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >> > > Can you point me to the exact change please as I don't want to > re-examine it all. :) Oops, sorry. The diff between 00 and 01 is this (apart from generated-configure.sh): diff --git a/src/hotspot/share/utilities/vmError.cpp b/src/hotspot/share/utilities/vmError.cpp --- a/src/hotspot/share/utilities/vmError.cpp +++ b/src/hotspot/share/utilities/vmError.cpp @@ -192,6 +192,7 @@ ???? st->cr(); ???? // Print the frames +??? StackFrameStream sfs(jt); ???? for(int i = 0; !sfs.is_done(); sfs.next(), i++) { ?????? sfs.current()->zero_print_on_error(i, st, buf, buflen); ?????? st->cr(); I.e. I added back the sfs variable that I accidentally removed in webrev.00. From rkennke at redhat.com Sun Oct 15 21:40:21 2017 From: rkennke at redhat.com (Roman Kennke) Date: Sun, 15 Oct 2017 23:40:21 +0200 Subject: RFR: 8189333: Fix Zero build after Atomic::xchg changes In-Reply-To: References: <003ff7d9-759f-1ef5-f580-18c2571b63e5@redhat.com> Message-ID: Am 15.10.2017 um 23:32 schrieb David Holmes: > Hi Roman, > > On 16/10/2017 7:12 AM, Roman Kennke wrote: >> Zero debug build has been broken by: JDK-8187977: Generalize >> Atomic::xchg to use templates. >> >> This patch fixes it by casting the unsigned literal to jint: >> >> http://cr.openjdk.java.net/~rkennke/8189333/webrev.00/ >> > > Looks fine. > > I can push this for you straight away (relatively speaking :) ) under > the trivial rule. Thanks! 
Roman From david.holmes at oracle.com Sun Oct 15 21:44:04 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 07:44:04 +1000 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <7deb690b-8b74-1ca2-948e-d76d0d133814@redhat.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> <7deb690b-8b74-1ca2-948e-d76d0d133814@redhat.com> Message-ID: On 16/10/2017 7:39 AM, Roman Kennke wrote: > Am 15.10.2017 um 23:25 schrieb David Holmes: >> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>> Hi David, >>> >>> thanks! >>> >>> I'm uploading a 2nd revision of the patch that excludes the >>> generated-configure.sh part, and adds a smallish Zero-related fix. >>> >>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>> >> >> Can you point me to the exact change please as I don't want to >> re-examine it all. :) > Oops, sorry. The diff between 00 and 01 is this (apart from > generated-configure.sh): > > diff --git a/src/hotspot/share/utilities/vmError.cpp > b/src/hotspot/share/utilities/vmError.cpp > --- a/src/hotspot/share/utilities/vmError.cpp > +++ b/src/hotspot/share/utilities/vmError.cpp > @@ -192,6 +192,7 @@ > ???? st->cr(); > > ???? // Print the frames > +??? StackFrameStream sfs(jt); > ???? for(int i = 0; !sfs.is_done(); sfs.next(), i++) { > ?????? sfs.current()->zero_print_on_error(i, st, buf, buflen); > ?????? st->cr(); > > I.e. I added back the sfs variable that I accidentally removed in > webrev.00. Looks good! David From rkennke at redhat.com Sun Oct 15 22:00:15 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 16 Oct 2017 00:00:15 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> Message-ID: Ok, I fixed all the comments you mentioned. Differential (against webrev.01): http://cr.openjdk.java.net/~rkennke/8171853/webrev.03.diff/ Full webrev: http://cr.openjdk.java.net/~rkennke/8171853/webrev.03/ Roman > Just spotted this: > > ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** > {@code CompLevel::CompLevel_full_optimization} -- C2 or Shark */ > > David > > On 16/10/2017 7:25 AM, David Holmes wrote: >> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>> Hi David, >>> >>> thanks! >>> >>> I'm uploading a 2nd revision of the patch that excludes the >>> generated-configure.sh part, and adds a smallish Zero-related fix. >>> >>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>> >> >> Can you point me to the exact change please as I don't want to >> re-examine it all. :) >> >> I'll pull this in and do a test build run internally. >> >> Thanks, >> David >> >>> Thanks, Roman >>> >>> >>>> Hi Roman, >>>> >>>> The build changes must be reviewed on build-dev - now cc'd. >>>> >>>> Thanks, >>>> David >>>> >>>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>>> The JEP to remove the Shark compiler has received exclusively >>>>> positive feedback (JDK-8189173) on zero-dev. So here comes the big >>>>> patch to remove it. >>>>> >>>>> What I have done: >>>>> >>>>> grep -i -R shark src >>>>> grep -i -R shark make >>>>> grep -i -R shark doc >>>>> grep -i -R shark doc >>>>> >>>>> and purged any reference to shark. Almost everything was >>>>> straightforward. 
>>>>> >>>>> The only things I wasn't really sure of: >>>>> >>>>> - in globals.hpp, I re-arranged the KIND_* bits to account for the >>>>> gap that removing KIND_SHARK left. I hope that's good? >>>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>>>> pd_address_in_code(), I am not sure it is the right thing to do. >>>>> If not, what *would* be the right thing? >>>>> >>>>> Then of course I did: >>>>> >>>>> rm -rf src/hotspot/share/shark >>>>> >>>>> I also went through the build machinery and removed stuff related >>>>> to Shark and LLVM libs. >>>>> >>>>> Now the only references in the whole JDK tree to shark is a 'Shark >>>>> Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>>>> >>>>> I tested by building a regular x86 JVM and running JTREG tests. >>>>> All looks fine. >>>>> >>>>> - I could not build zero because it seems broken because of the >>>>> recent Atomic::* changes >>>>> - I could not test any of the other arches that seemed to >>>>> reference Shark (arm and sparc) >>>>> >>>>> Here's the full webrev: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>>> >>>>> >>>>> Can I get a review on this? >>>>> >>>>> Thanks, Roman >>>>> >>> From david.holmes at oracle.com Sun Oct 15 22:08:52 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 08:08:52 +1000 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> Message-ID: <321ac223-1dc8-841c-93c0-c39c770b7e20@oracle.com> Looks good. Thanks, David On 16/10/2017 8:00 AM, Roman Kennke wrote: > > Ok, I fixed all the comments you mentioned. > > Differential (against webrev.01): > http://cr.openjdk.java.net/~rkennke/8171853/webrev.03.diff/ > > Full webrev: > http://cr.openjdk.java.net/~rkennke/8171853/webrev.03/ > > > Roman > >> Just spotted this: >> >> ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** >> {@code CompLevel::CompLevel_full_optimization} -- C2 or Shark */ >> >> David >> >> On 16/10/2017 7:25 AM, David Holmes wrote: >>> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>>> Hi David, >>>> >>>> thanks! >>>> >>>> I'm uploading a 2nd revision of the patch that excludes the >>>> generated-configure.sh part, and adds a smallish Zero-related fix. >>>> >>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>>> >>> >>> Can you point me to the exact change please as I don't want to >>> re-examine it all. :) >>> >>> I'll pull this in and do a test build run internally. >>> >>> Thanks, >>> David >>> >>>> Thanks, Roman >>>> >>>> >>>>> Hi Roman, >>>>> >>>>> The build changes must be reviewed on build-dev - now cc'd. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>>>> The JEP to remove the Shark compiler has received exclusively >>>>>> positive feedback (JDK-8189173) on zero-dev. So here comes the big >>>>>> patch to remove it. >>>>>> >>>>>> What I have done: >>>>>> >>>>>> grep -i -R shark src >>>>>> grep -i -R shark make >>>>>> grep -i -R shark doc >>>>>> grep -i -R shark doc >>>>>> >>>>>> and purged any reference to shark. Almost everything was >>>>>> straightforward. >>>>>> >>>>>> The only things I wasn't really sure of: >>>>>> >>>>>> - in globals.hpp, I re-arranged the KIND_* bits to account for the >>>>>> gap that removing KIND_SHARK left. I hope that's good? 
>>>>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>>>>> pd_address_in_code(), I am not sure it is the right thing to do. >>>>>> If not, what *would* be the right thing? >>>>>> >>>>>> Then of course I did: >>>>>> >>>>>> rm -rf src/hotspot/share/shark >>>>>> >>>>>> I also went through the build machinery and removed stuff related >>>>>> to Shark and LLVM libs. >>>>>> >>>>>> Now the only references in the whole JDK tree to shark is a 'Shark >>>>>> Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>>>>> >>>>>> I tested by building a regular x86 JVM and running JTREG tests. >>>>>> All looks fine. >>>>>> >>>>>> - I could not build zero because it seems broken because of the >>>>>> recent Atomic::* changes >>>>>> - I could not test any of the other arches that seemed to >>>>>> reference Shark (arm and sparc) >>>>>> >>>>>> Here's the full webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>>>> >>>>>> >>>>>> Can I get a review on this? >>>>>> >>>>>> Thanks, Roman >>>>>> >>>> > From vladimir.kozlov at oracle.com Sun Oct 15 22:14:53 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Sun, 15 Oct 2017 15:14:53 -0700 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <321ac223-1dc8-841c-93c0-c39c770b7e20@oracle.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> <321ac223-1dc8-841c-93c0-c39c770b7e20@oracle.com> Message-ID: <85b68a77-f418-c619-0a51-c7389d7c5a86@oracle.com> +1 Thanks, Vladimir On 10/15/17 3:08 PM, David Holmes wrote: > Looks good. > > Thanks, > David > > On 16/10/2017 8:00 AM, Roman Kennke wrote: >> >> Ok, I fixed all the comments you mentioned. >> >> Differential (against webrev.01): >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03.diff/ >> >> Full webrev: >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03/ >> >> Roman >> >>> Just spotted this: >>> >>> ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** {@code CompLevel::CompLevel_full_optimization} -- C2 >>> or Shark */ >>> >>> David >>> >>> On 16/10/2017 7:25 AM, David Holmes wrote: >>>> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>>>> Hi David, >>>>> >>>>> thanks! >>>>> >>>>> I'm uploading a 2nd revision of the patch that excludes the generated-configure.sh part, and adds a smallish >>>>> Zero-related fix. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>>> >>>> Can you point me to the exact change please as I don't want to re-examine it all. :) >>>> >>>> I'll pull this in and do a test build run internally. >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, Roman >>>>> >>>>> >>>>>> Hi Roman, >>>>>> >>>>>> The build changes must be reviewed on build-dev - now cc'd. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>>>>> The JEP to remove the Shark compiler has received exclusively positive feedback (JDK-8189173) on zero-dev. So >>>>>>> here comes the big patch to remove it. >>>>>>> >>>>>>> What I have done: >>>>>>> >>>>>>> grep -i -R shark src >>>>>>> grep -i -R shark make >>>>>>> grep -i -R shark doc >>>>>>> grep -i -R shark doc >>>>>>> >>>>>>> and purged any reference to shark. Almost everything was straightforward. >>>>>>> >>>>>>> The only things I wasn't really sure of: >>>>>>> >>>>>>> - in globals.hpp, I re-arranged the KIND_* bits to account for the gap that removing KIND_SHARK left. I hope >>>>>>> that's good? 
>>>>>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in pd_address_in_code(), I am not sure it is the right thing >>>>>>> to do. If not, what *would* be the right thing? >>>>>>> >>>>>>> Then of course I did: >>>>>>> >>>>>>> rm -rf src/hotspot/share/shark >>>>>>> >>>>>>> I also went through the build machinery and removed stuff related to Shark and LLVM libs. >>>>>>> >>>>>>> Now the only references in the whole JDK tree to shark is a 'Shark Bay' in a timezone file, and 'Wireshark' in >>>>>>> some tests ;-) >>>>>>> >>>>>>> I tested by building a regular x86 JVM and running JTREG tests. All looks fine. >>>>>>> >>>>>>> - I could not build zero because it seems broken because of the recent Atomic::* changes >>>>>>> - I could not test any of the other arches that seemed to reference Shark (arm and sparc) >>>>>>> >>>>>>> Here's the full webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>>>>> >>>>>>> Can I get a review on this? >>>>>>> >>>>>>> Thanks, Roman >>>>>>> >>>>> >> From david.holmes at oracle.com Mon Oct 16 00:31:55 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 10:31:55 +1000 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <321ac223-1dc8-841c-93c0-c39c770b7e20@oracle.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> <321ac223-1dc8-841c-93c0-c39c770b7e20@oracle.com> Message-ID: <331579a0-29de-f152-2dd4-66987896c463@oracle.com> My internal JPRT run went fine. So this just needs a build team signoff from the perspective of the patch. However, as this has had a JEP submitted for it, the code changes can not be pushed until the JEP has been targeted. Thanks, David On 16/10/2017 8:08 AM, David Holmes wrote: > Looks good. > > Thanks, > David > > On 16/10/2017 8:00 AM, Roman Kennke wrote: >> >> Ok, I fixed all the comments you mentioned. >> >> Differential (against webrev.01): >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03.diff/ >> >> Full webrev: >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03/ >> >> >> Roman >> >>> Just spotted this: >>> >>> ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** >>> {@code CompLevel::CompLevel_full_optimization} -- C2 or Shark */ >>> >>> David >>> >>> On 16/10/2017 7:25 AM, David Holmes wrote: >>>> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>>>> Hi David, >>>>> >>>>> thanks! >>>>> >>>>> I'm uploading a 2nd revision of the patch that excludes the >>>>> generated-configure.sh part, and adds a smallish Zero-related fix. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>>>> >>>> >>>> Can you point me to the exact change please as I don't want to >>>> re-examine it all. :) >>>> >>>> I'll pull this in and do a test build run internally. >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, Roman >>>>> >>>>> >>>>>> Hi Roman, >>>>>> >>>>>> The build changes must be reviewed on build-dev - now cc'd. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>>>>> The JEP to remove the Shark compiler has received exclusively >>>>>>> positive feedback (JDK-8189173) on zero-dev. So here comes the >>>>>>> big patch to remove it. >>>>>>> >>>>>>> What I have done: >>>>>>> >>>>>>> grep -i -R shark src >>>>>>> grep -i -R shark make >>>>>>> grep -i -R shark doc >>>>>>> grep -i -R shark doc >>>>>>> >>>>>>> and purged any reference to shark. Almost everything was >>>>>>> straightforward. 
>>>>>>> >>>>>>> The only things I wasn't really sure of: >>>>>>> >>>>>>> - in globals.hpp, I re-arranged the KIND_* bits to account for >>>>>>> the gap that removing KIND_SHARK left. I hope that's good? >>>>>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>>>>>> pd_address_in_code(), I am not sure it is the right thing to do. >>>>>>> If not, what *would* be the right thing? >>>>>>> >>>>>>> Then of course I did: >>>>>>> >>>>>>> rm -rf src/hotspot/share/shark >>>>>>> >>>>>>> I also went through the build machinery and removed stuff related >>>>>>> to Shark and LLVM libs. >>>>>>> >>>>>>> Now the only references in the whole JDK tree to shark is a >>>>>>> 'Shark Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>>>>>> >>>>>>> I tested by building a regular x86 JVM and running JTREG tests. >>>>>>> All looks fine. >>>>>>> >>>>>>> - I could not build zero because it seems broken because of the >>>>>>> recent Atomic::* changes >>>>>>> - I could not test any of the other arches that seemed to >>>>>>> reference Shark (arm and sparc) >>>>>>> >>>>>>> Here's the full webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>>>>> >>>>>>> >>>>>>> Can I get a review on this? >>>>>>> >>>>>>> Thanks, Roman >>>>>>> >>>>> >> From david.holmes at oracle.com Mon Oct 16 01:18:10 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 11:18:10 +1000 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <7265c30d-946b-19c4-a1b3-c3314a869ee8@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> <7265c30d-946b-19c4-a1b3-c3314a869ee8@oracle.com> Message-ID: <33af17b9-6dce-5a5e-cb94-b3c1afbe8532@oracle.com> One tiny follow up as I was looking at this code: src/hotspot/share/services/mallocSiteTable.hpp 65 MallocSiteHashtableEntry* _next; should be 65 MallocSiteHashtableEntry* volatile _next; as we operate on it with CAS. Thanks, David On 14/10/2017 10:32 PM, David Holmes wrote: > Hi Coleen, > > These changes all seem okay to me - except I can't comment on the > Atomic::sub implementation. :) > > Thanks for adding the assert to header_addr(). FYI from objectMonitor.hpp: > > // ObjectMonitor Layout Overview/Highlights/Restrictions: > // > // - The _header field must be at offset 0 because the displaced header > //?? from markOop is stored there. We do not want markOop.hpp to include > //?? ObjectMonitor.hpp to avoid exposing ObjectMonitor everywhere. This > //?? means that ObjectMonitor cannot inherit from any other class nor can > //?? it use any virtual member functions. This restriction is critical to > //?? the proper functioning of the VM. > > so it is important we ensure this holds. > > Thanks, > David > > On 14/10/2017 4:34 AM, coleen.phillimore at oracle.com wrote: >> >> Hi, Here is the version with the changes from Kim's comments that has >> passed at least testing with JPRT and tier1, locally.?? More testing >> (tier2-5) is in progress. >> >> Also includes a corrected version of Atomic::sub care of Erik Osterlund. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/8188220.kim-review-changes/webrev >> open webrev at >> http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev >> >> Full version: >> >> http://cr.openjdk.java.net/~coleenp/8188220.03/webrev >> >> Thanks! 
>> Coleen >> >> On 10/13/17 9:25 AM, coleen.phillimore at oracle.com wrote: >>> >>> Hi Kim, Thank you for the detailed review and the time you've spent >>> on it, and discussion yesterday. >>> >>> On 10/12/17 7:17 PM, Kim Barrett wrote: >>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Summary: With the new template functions these are unnecessary. >>>>> >>>>> The changes are mostly s/_ptr// and removing the cast to return >>>>> type.? There weren't many types that needed to be improved to match >>>>> the template version of the function.?? Some notes: >>>>> 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, rearranging >>>>> arguments. >>>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.? I >>>>> disliked the first name because it's not explicit from the callers >>>>> that there's an underlying cas.? If people want to fight, I'll >>>>> remove the function and use cmpxchg because there are only a couple >>>>> places where this is a little nicer. >>>>> 3. Added Atomic::sub() >>>>> >>>>> Tested with JPRT, mach5 tier1-5 on linux,windows and solaris. >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8188220 >>>>> >>>>> Thanks, >>>>> Coleen >>>> I looked harder at the potential ABA problems, and believe they are >>>> okay.? There can be multiple threads doing pushes, and there can be >>>> multiple threads doing pops, but not both at the same time. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/cpu/zero/cppInterpreter_zero.cpp >>>> ? 279???? if (Atomic::cmpxchg(monitor, lockee->mark_addr(), disp) != >>>> disp) { >>>> >>>> How does this work?? monitor and disp seem like they have unrelated >>>> types?? Given that this is zero-specific code, maybe this hasn't been >>>> tested? >>>> >>>> Similarly here: >>>> ? 423?????? if (Atomic::cmpxchg(header, rcvr->mark_addr(), lock) != >>>> lock) { >>> >>> I haven't built zero.? I don't know how to do this anymore (help?) I >>> fixed the obvious type mismatches here and in >>> bytecodeInterpreter.cpp.? I'll try to build it. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/asm/assembler.cpp >>>> ? 239???????? dcon->value_fn = cfn; >>>> >>>> Is it actually safe to remove the atomic update?? If multiple threads >>>> performing the assignment *are* possible (and I don't understand the >>>> context yet, so don't know the answer to that), then a bare non-atomic >>>> assignment is a race, e.g. undefined behavior. >>>> >>>> Regardless of that, I think the CAST_FROM_FN_PTR should be retained. >>> >>> I can find no uses of this code, ie. looking for "delayed_value". I >>> think it was early jsr292 code.? I could also not find any >>> combination of casts that would make it compile, so in the end I >>> believed the comment and took out the cmpxchg.?? The code appears to >>> be intended to for bootstrapping, see the call to >>> update_delayed_values() in JavaClasses::compute_offsets(). >>> >>> The CAST_FROM_FN_PTR was to get it to compile with cmpxchg, the new >>> code does not require a cast.? If you can help with finding the right >>> set of casts, I'd be happy to put the cmpxchg back in. I just >>> couldn't find one. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/classfile/classLoaderData.cpp >>>> ? 167?? 
Chunk* head = (Chunk*) OrderAccess::load_acquire(&_head); >>>> >>>> I think the cast to Chunk* is no longer needed. >>> >>> Missed another, thanks.? No that's the same one David found. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/classfile/classLoaderData.cpp >>>> ? 946???? ClassLoaderData* old = Atomic::cmpxchg(cld, cld_addr, >>>> (ClassLoaderData*)NULL); >>>> ? 947???? if (old != NULL) { >>>> ? 948?????? delete cld; >>>> ? 949?????? // Returns the data. >>>> ? 950?????? return old; >>>> ? 951???? } >>>> >>>> That could instead be >>>> >>>> ?? if (!Atomic::replace_if_null(cld, cld_addr)) { >>>> ???? delete cld;?????????? // Lost the race. >>>> ???? return *cld_addr;???? // Use the winner's value. >>>> ?? } >>>> >>>> And apparently the caller of CLDG::add doesn't care whether the >>>> returned CLD has actually been added to the graph yet.? If that's not >>>> true, then there's a bug here, since a race loser might return a >>>> winner's value before the winner has actually done the insertion. >>> >>> True, the race loser doesn't care whether the CLD has been added to >>> the graph. >>> Your instead code requires a comment that replace_if_null is really a >>> compare exchange and has an extra read of the original value, so I am >>> leaving what I have which is clearer to me. >>> >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/classfile/verifier.cpp >>>> ?? 71 static void* verify_byte_codes_fn() { >>>> ?? 72?? if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == >>>> NULL) { >>>> ?? 73???? void *lib_handle = os::native_java_library(); >>>> ?? 74???? void *func = os::dll_lookup(lib_handle, >>>> "VerifyClassCodesForMajorVersion"); >>>> ?? 75???? OrderAccess::release_store(&_verify_byte_codes_fn, func); >>>> ?? 76???? if (func == NULL) { >>>> ?? 77?????? _is_new_verify_byte_codes_fn = false; >>>> ?? 78?????? func = os::dll_lookup(lib_handle, "VerifyClassCodes"); >>>> ?? 79 OrderAccess::release_store(&_verify_byte_codes_fn, func); >>>> ?? 80???? } >>>> ?? 81?? } >>>> ?? 82?? return (void*)_verify_byte_codes_fn; >>>> ?? 83 } >>>> >>>> [pre-existing] >>>> >>>> I think this code has race problems; a caller could unexpectedly and >>>> inappropriately return NULL.? Consider the case where there is no >>>> VerifyClassCodesForMajorVersion, but there is VerifyClassCodes. >>>> >>>> The variable is initially NULL. >>>> >>>> Both Thread1 and Thread2 reach line 73, having both seen a NULL value >>>> for the variable. >>>> >>>> Thread1 reaches line 80, setting the variable to VerifyClassCodes. >>>> >>>> Thread2 reaches line 76, resetting the variable to NULL. >>>> >>>> Thread1 reads the now (momentarily) NULL value and returns it. >>>> >>>> I think the first release_store should be conditional on func != NULL. >>>> Also, the usage of _is_new_verify_byte_codes_fn seems suspect. >>>> And a minor additional nit: the cast in the return is unnecessary. >>> >>> Yes, this looks like a bug.?? I'll cut/paste this and file it. It may >>> be that this is support for the old verifier in old jdk versions that >>> can be cleaned up. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/code/nmethod.cpp >>>> 1664?? nmethod* observed_mark_link = _oops_do_mark_link; >>>> 1665?? if (observed_mark_link == NULL) { >>>> 1666???? // Claim this nmethod for this thread to mark. >>>> 1667???? 
if (Atomic::cmpxchg_if_null(NMETHOD_SENTINEL, >>>> &_oops_do_mark_link)) { >>>> >>>> With these changes, the only use of observed_mark_link is in the if. >>>> I'm not sure that variable is really useful anymore, e.g. just use >>>> >>>> ?? if (_oops_do_mark_link == NULL) { >>> >>> Ok fixed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>>> >>>> In CMSCollector::par_take_from_overflow_list, if BUSY and prefix were >>>> of type oopDesc*, I think there would be a whole lot fewer casts and >>>> cast_to_oop's.? Later on, I think suffix_head, observed_overflow_list, >>>> and curr_overflow_list could also be oopDesc* instead of oop to >>>> eliminate more casts. >>> >>> I actually tried to make this change but ran into more fan out that >>> way, so went back and just fixed the cmpxchg calls to cast oops to >>> oopDesc* and things were less perturbed that way. >>>> >>>> And some similar changes in CMSCollector::par_push_on_overflow_list. >>>> >>>> And similarly in parNewGeneration.cpp, in push_on_overflow_list and >>>> take_from_overflow_list_work. >>>> >>>> As noted in the comments for JDK-8165857, the lists and "objects" >>>> involved here aren't really oops, but rather the shattered remains of >>> >>> Yes, somewhat horrified at the value of BUSY. >>>> oops.? The suggestion there was to use HeapWord* and carry through the >>>> fanout; what was actually done was to change _overflow_list to >>>> oopDesc* to minimize fanout, even though that's kind of lying to the >>>> type system.? Now, with the cleanup of cmpxchg_ptr and such, we're >>>> paying the price of doing the minimal thing back then. >>> >>> I will file an RFE about cleaning this up.? I think what I've done >>> was the minimal thing. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>>> 7960?? Atomic::add(-n, &_num_par_pushes); >>>> >>>> Atomic::sub >>> >>> fixed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/cms/parNewGeneration.cpp >>>> 1455?? Atomic::add(-n, &_num_par_pushes); >>> fixed. >>>> Atomic::sub >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/g1/dirtyCardQueue.cpp >>>> ? 283???? void* actual = Atomic::cmpxchg(next, >>>> &_cur_par_buffer_node, nd); >>>> ... >>>> ? 289?????? nd = static_cast(actual); >>>> >>>> Change actual's type to BufferNode* and remove the cast on line 289. >>> >>> fixed.? missed that one. gross. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/g1/g1CollectedHeap.cpp >>>> >>>> [pre-existing] >>>> 3499???????? old = (CompiledMethod*)_postponed_list; >>>> >>>> I think that cast is only needed because >>>> G1CodeCacheUnloadingTask::_postponed_list is incorrectly typed as >>>> "volatile CompiledMethod*", when I think it ought to be >>>> "CompiledMethod* volatile". >>>> >>>> I think G1CodeCacheUnloading::_claimed_nmethod is similarly mis-typed, >>>> with a similar should not be needed cast: >>>> 3530?????? first = (CompiledMethod*)_claimed_nmethod; >>>> >>>> and another for _postponed_list here: >>>> 3552?????? claim = (CompiledMethod*)_postponed_list; >>> >>> I've fixed this.?? C++ is so confusing about where to put the >>> volatile.?? Everyone has been tripped up by it. 
>>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/g1/g1HotCardCache.cpp >>>> ?? 77?? jbyte* previous_ptr = (jbyte*)Atomic::cmpxchg(card_ptr, >>>> >>>> I think the cast of the cmpxchg result is no longer needed. >>> >>> fixed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp >>>> ? 254?????? char* touch_addr = (char*)Atomic::add(actual_chunk_size, >>>> &_cur_addr) - actual_chunk_size; >>>> >>>> I think the cast of the add result is no longer needed. >>> got it already. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/g1/g1StringDedup.cpp >>>> ? 213?? return (size_t)Atomic::add(partition_size, &_next_bucket) - >>>> partition_size; >>>> >>>> I think the cast of the add result is no longer needed. >>> >>> I was slacking in the g1 files.? fixed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >>>> ? 200?????? PerRegionTable* res = >>>> ? 201???????? Atomic::cmpxchg(nxt, &_free_list, fl); >>>> >>>> Please remove the line break, now that the code has been simplified. >>>> >>>> But wait, doesn't this alloc exhibit classic ABA problems?? I *think* >>>> this works because alloc and bulk_free are called in different phases, >>>> never overlapping. >>> >>> I don't know.? Do you want to file a bug to investigate this? >>> fixed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/g1/sparsePRT.cpp >>>> ? 295???? SparsePRT* res = >>>> ? 296?????? Atomic::cmpxchg(sprt, &_head_expanded_list, hd); >>>> and >>>> ? 307???? SparsePRT* res = >>>> ? 308?????? Atomic::cmpxchg(next, &_head_expanded_list, hd); >>>> >>>> I'd rather not have the line breaks in these either. >>>> >>>> And get_from_expanded_list also appears to have classic ABA problems. >>>> I *think* this works because add_to_expanded_list and >>>> get_from_expanded_list are called in different phases, never >>>> overlapping. >>> >>> Fixed, same question as above?? Or one bug to investigate both? >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>>> ? 262?? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>>> ? 263?????????????????????????????????? (volatile intptr_t *)&_data, >>>> ? 264 (intptr_t)old_age._data); >>>> >>>> This should be >>>> >>>> ?? return Atomic::cmpxchg(new_age._data, &_data, old_age._data); >>> >>> fixed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/interpreter/bytecodeInterpreter.cpp >>>> This doesn't have any casts, which I think is correct. >>>> ? 708???????????? if (Atomic::cmpxchg(header, rcvr->mark_addr(), >>>> mark) == mark) { >>>> >>>> but these do. >>>> ? 718???????????? if (Atomic::cmpxchg((void*)new_header, >>>> rcvr->mark_addr(), mark) == mark) { >>>> ? 737???????????? if (Atomic::cmpxchg((void*)new_header, >>>> rcvr->mark_addr(), header) == header) { >>>> >>>> I'm not sure how the ones with casts even compile?? mark_addr() seems >>>> to be a markOop*, which is a markOopDesc**, where markOopDesc is a >>>> class.? void* is not implicitly convertible to markOopDesc*. >>>> >>>> Hm, this entire file is #ifdef CC_INTERP.? Is this zero-only code?? 
Or >>>> something like that? >>>> >>>> Similarly here: >>>> ? 906?????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >>>> mark) == mark) { >>>> and >>>> ? 917?????????? if (Atomic::cmpxchg((void*)new_header, >>>> lockee->mark_addr(), mark) == mark) { >>>> ? 935?????????? if (Atomic::cmpxchg((void*)new_header, >>>> lockee->mark_addr(), header) == header) { >>>> >>>> and here: >>>> 1847?????????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >>>> mark) == mark) { >>>> 1858?????????????? if (Atomic::cmpxchg((void*)new_header, >>>> lockee->mark_addr(), mark) == mark) { >>>> 1878?????????????? if (Atomic::cmpxchg((void*)new_header, >>>> lockee->mark_addr(), header) == header) { >>>> >>>> and here: >>>> 1847?????????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >>>> mark) == mark) { >>>> 1858?????????????? if (Atomic::cmpxchg((void*)new_header, >>>> lockee->mark_addr(), mark) == mark) { >>>> 1878?????????????? if (Atomic::cmpxchg((void*)new_header, >>>> lockee->mark_addr(), header) == header) { >>> >>> I've changed all these.?? This is part of Zero. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/memory/metaspace.cpp >>>> 1502?? size_t value = OrderAccess::load_acquire(&_capacity_until_GC); >>>> ... >>>> 1537?? return (size_t)Atomic::sub((intptr_t)v, &_capacity_until_GC); >>>> >>>> These and other uses of _capacity_until_GC suggest that variable's >>>> type should be size_t rather than intptr_t.? Note that I haven't done >>>> a careful check of uses to see if there are any places where such a >>>> change would cause problems. >>> >>> Yes, I had a hard time with metaspace.cpp because I agree >>> _capacity_until_GC should be size_t.?? Tried to make this change and >>> it cascaded a bit.? I'll file an RFE to change this type separately. >>> >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/oops/constantPool.cpp >>>> ? 229?? OrderAccess::release_store((Klass* volatile *)adr, k); >>>> ? 246?? OrderAccess::release_store((Klass* volatile *)adr, k); >>>> ? 514?? OrderAccess::release_store((Klass* volatile *)adr, k); >>>> >>>> Casts are not needed. >>> >>> fixed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/oops/constantPool.hpp >>>> ? 148???? volatile intptr_t adr = >>>> OrderAccess::load_acquire(obj_at_addr_raw(which)); >>>> >>>> [pre-existing] >>>> Why is adr declared volatile? >>> >>> golly beats me.? concurrency is scary, especially in the constant pool. >>> The load_acquire() should make sure the value is fetched from memory >>> so volatile is unneeded. >>> >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/oops/cpCache.cpp >>>> ? 157???? intx newflags = (value & parameter_size_mask); >>>> ? 158???? Atomic::cmpxchg(newflags, &_flags, (intx)0); >>>> >>>> This is a nice demonstration of why I wanted to include some value >>>> preserving integral conversions in cmpxchg, rather than requiring >>>> exact type matching in the integral case.? There have been some others >>>> that I haven't commented on.? Apparently we (I) got away with >>>> including such conversions in Atomic::add, which I'd forgotten about. >>>> And see comment regarding Atomic::sub below. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/oops/cpCache.hpp >>>> ? 139?? 
volatile Metadata*?? _f1;?????? // entry specific metadata >>>> field >>>> >>>> [pre-existing] >>>> I suspect the type should be Metadata* volatile.? And that would >>>> eliminate the need for the cast here: >>>> >>>> ? 339?? Metadata* f1_ord() const?????????????????????? { return >>>> (Metadata *)OrderAccess::load_acquire(&_f1); } >>>> >>>> I don't know if there are any other changes needed or desirable around >>>> _f1 usage. >>> >>> yes, fixed this. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/oops/method.hpp >>>> ? 139?? volatile address from_compiled_entry() const?? { return >>>> OrderAccess::load_acquire(&_from_compiled_entry); } >>>> ? 140?? volatile address from_compiled_entry_no_trampoline() const; >>>> ? 141?? volatile address from_interpreted_entry() const{ return >>>> OrderAccess::load_acquire(&_from_interpreted_entry); } >>>> >>>> [pre-existing] >>>> The volatile qualifiers here seem suspect to me. >>> >>> Again much suspicion about concurrency and giant pain, which I >>> remember, of debugging these when they were broken. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/oops/oop.inline.hpp >>>> ? 391???? narrowOop old = (narrowOop)Atomic::xchg(val, >>>> (narrowOop*)dest); >>>> >>>> Cast of return type is not needed. >>> >>> fixed. >>> >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/prims/jni.cpp >>>> >>>> [pre-existing] >>>> >>>> copy_jni_function_table should be using Copy::disjoint_words_atomic. >>> >>> yuck. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/prims/jni.cpp >>>> >>>> [pre-existing] >>>> >>>> 3892?? // We're about to use Atomic::xchg for synchronization. Some >>>> Zero >>>> 3893?? // platforms use the GCC builtin __sync_lock_test_and_set for >>>> this, >>>> 3894?? // but __sync_lock_test_and_set is not guaranteed to do what >>>> we want >>>> 3895?? // on all architectures.? So we check it works before relying >>>> on it. >>>> 3896 #if defined(ZERO) && defined(ASSERT) >>>> 3897?? { >>>> 3898???? jint a = 0xcafebabe; >>>> 3899???? jint b = Atomic::xchg(0xdeadbeef, &a); >>>> 3900???? void *c = &a; >>>> 3901???? void *d = Atomic::xchg(&b, &c); >>>> 3902???? assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, >>>> "Atomic::xchg() works"); >>>> 3903???? assert(c == &b && d == &a, "Atomic::xchg() works"); >>>> 3904?? } >>>> 3905 #endif // ZERO && ASSERT >>>> >>>> It seems rather strange to be testing Atomic::xchg() here, rather than >>>> as part of unit testing Atomic?? Fail unit testing => don't try to >>>> use... >>> >>> This is zero.? I'm not touching this. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/prims/jvmtiRawMonitor.cpp >>>> ? 130???? if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >>>> ? 142???? if (_owner == NULL && Atomic::cmpxchg_if_null((void*)Self, >>>> &_owner)) { >>>> >>>> I think these casts aren't needed. _owner is void*, and Self is >>>> Thread*, which is implicitly convertible to void*. >>>> >>>> Similarly here, for the THREAD argument: >>>> ? 280???? Contended = Atomic::cmpxchg((void*)THREAD, &_owner, >>>> (void*)NULL); >>>> ? 283???? Contended = Atomic::cmpxchg((void*)THREAD, &_owner, >>>> (void*)NULL); >>> >>> Okay, let me see if the compiler(s) eat that. 
(yes they do) >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/prims/jvmtiRawMonitor.hpp >>>> >>>> This file is in the webrev, but seems to be unchanged. >>> >>> It'll be cleaned up with the the commit and not be part of the >>> changeset. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/atomic.hpp >>>> ? 520 template >>>> ? 521 inline D Atomic::sub(I sub_value, D volatile* dest) { >>>> ? 522?? STATIC_ASSERT(IsPointer::value || IsIntegral::value); >>>> ? 523?? // Assumes two's complement integer representation. >>>> ? 524?? #pragma warning(suppress: 4146) >>>> ? 525?? return Atomic::add(-sub_value, dest); >>>> ? 526 } >>>> >>>> I'm pretty sure this implementation is incorrect.? I think it produces >>>> the wrong result when I and D are both unsigned integer types and >>>> sizeof(I) < sizeof(D). >>> >>> Can you suggest a correction?? I just copied Atomic::dec(). >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/mutex.cpp >>>> ? 304?? intptr_t v = Atomic::cmpxchg((intptr_t)_LBIT, >>>> &_LockWord.FullWord, (intptr_t)0);? // agro ... >>>> >>>> _LBIT should probably be intptr_t, rather than an enum.? Note that the >>>> enum type is unused.? The old value here is another place where an >>>> implicit widening of same signedness would have been nice. (Such >>>> implicit widening doesn't work for enums, since it's unspecified >>>> whether they default to signed or unsigned representation, and >>>> implementatinos differ.) >>> >>> This would be a good/simple cleanup.? I changed it to const intptr_t >>> _LBIT = 1; >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/mutex.hpp >>>> >>>> [pre-existing] >>>> >>>> I think the Address member of the SplitWord union is unused. Looking >>>> at AcquireOrPush (and others), I'm wondering whether it *should* be >>>> used there, or whether just using intptr_t casts and doing integral >>>> arithmetic (as is presently being done) is easier and clearer. >>>> >>>> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp >>>> rather than polluting the global namespace.? And technically, that >>>> name is reserved word. >>> >>> I moved both this and _LBIT into the top of mutex.cpp since they are >>> used there. >>> Cant define const intptr_t _LBIT =1; in a class in our version of C++. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/objectMonitor.cpp >>>> ? 252?? void * cur = Atomic::cmpxchg((void*)Self, &_owner, >>>> (void*)NULL); >>>> ? 409?? if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >>>> 1983?????? ox = (Thread*)Atomic::cmpxchg((void*)Self, &_owner, >>>> (void*)NULL); >>>> >>>> I think the casts of Self aren't needed. >>> >>> fixed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/objectMonitor.cpp >>>> ? 995?????? if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { >>>> 1020???????? if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { >>>> >>>> I think the casts of THREAD aren't needed. >>> >>> nope, fixed. >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/objectMonitor.hpp >>>> ? 254?? 
markOopDesc* volatile* header_addr(); >>>> >>>> Why isn't this volatile markOop* ? >>> >>> fixed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/synchronizer.cpp >>>> ? 242???????? Atomic::cmpxchg_if_null((void*)Self, &(m->_owner))) { >>>> >>>> I think the cast of Self isn't needed. >>> >>> fixed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/synchronizer.cpp >>>> ? 992?? for (; block != NULL; block = (PaddedEnd >>>> *)next(block)) { >>>> 1734???? for (; block != NULL; block = (PaddedEnd >>>> *)next(block)) { >>>> >>>> [pre-existing] >>>> All calls to next() pass a PaddedEnd* and cast the >>>> result.? How about moving all that behavior into next(). >>> >>> I fixed this next() function, but it necessitated a cast to FreeNext >>> field.? The PaddedEnd<> type was intentionally not propagated to all >>> the things that use it.?? Which is a shame because there are a lot >>> more casts to PaddedEnd that could have been removed. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/synchronizer.cpp >>>> 1970???? if (monitor > (ObjectMonitor *)&block[0] && >>>> 1971???????? monitor < (ObjectMonitor *)&block[_BLOCKSIZE]) { >>>> >>>> [pre-existing] >>>> Are the casts needed here?? I think PaddedEnd is >>>> derived from ObjectMonitor, so implicit conversions should apply. >>> >>> prob not.? removed them. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/synchronizer.hpp >>>> ?? 28 #include "memory/padded.hpp" >>>> ? 163?? static PaddedEnd * volatile gBlockList; >>>> >>>> I was going to suggest as an alternative just making gBlockList a file >>>> scoped variable in synchronizer.cpp, since it isn't used outside of >>>> that file. Except that it is referenced by vmStructs.? Curses! >>> >>> It's also used by the SA. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/runtime/thread.cpp >>>> 4707?? intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, >>>> (intptr_t)0); >>>> >>>> This and other places suggest LOCKBIT should be defined as intptr_t, >>>> rather than as an enum value.? The MuxBits enum type is unused. >>>> >>>> And the cast of 0 is another case where implicit widening would be >>>> nice. >>> >>> Making LOCKBIT a const intptr_t = 1 removes a lot of casts. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/services/mallocSiteTable.cpp >>>> ? 261 bool MallocSiteHashtableEntry::atomic_insert(const >>>> MallocSiteHashtableEntry* entry) { >>>> ? 262?? return Atomic::cmpxchg_if_null(entry, (const >>>> MallocSiteHashtableEntry**)&_next); >>>> ? 263 } >>>> >>>> I think the problem here that is leading to the cast is that >>>> atomic_insert is taking a const T*.? Note that it's only caller passes >>>> a non-const T*. >>> >>> I'll change the type to non-const.? We try to use consts... >>> >>> Thanks for the detailed review!? The gcc compiler seems happy so far, >>> I'll post a webrev of the result of these changes after fixing >>> Atomic::sub() and seeing how the other compilers deal with these >>> changes. 
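As a standalone sketch of the atomic_insert point above (using std::atomic and invented names rather than HotSpot's Atomic class): once the inserted node and the next field share the same non-const pointer type, a CAS-based insert needs no casts at all.

    #include <atomic>

    struct Entry {
      int data;
      std::atomic<Entry*> next;

      Entry() : data(0), next(nullptr) {}

      // Link 'e' in if nothing has been linked yet; returns true on success,
      // false if another thread already installed a successor.
      bool atomic_insert(Entry* e) {      // non-const parameter, so no cast
        Entry* expected = nullptr;
        return next.compare_exchange_strong(expected, e);
      }
    };

Had the parameter stayed 'const Entry*', the compare-exchange against the non-const next field would force exactly the kind of cast the review comment is trying to remove.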
>>> >>> Thanks, >>> Coleen >>> >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> >>> >> From rkennke at redhat.com Mon Oct 16 05:49:26 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 16 Oct 2017 07:49:26 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <331579a0-29de-f152-2dd4-66987896c463@oracle.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> <321ac223-1dc8-841c-93c0-c39c770b7e20@oracle.com> <331579a0-29de-f152-2dd4-66987896c463@oracle.com> Message-ID: Hi David, thanks for reviewing and testing! The interaction between JEPs and patches going in is not really clear to me, nor is it well documented. For example, we're already pushing patches for JEP 304: Garbage Collection Interface, even though it's only in 'candidate' state... In any case, I'll ping Mark Reinhold about moving the Shark JEP forward. Thanks again, Roman > My internal JPRT run went fine. So this just needs a build team > signoff from the perspective of the patch. > > However, as this has had a JEP submitted for it, the code changes can > not be pushed until the JEP has been targeted. > > Thanks, > David > > On 16/10/2017 8:08 AM, David Holmes wrote: >> Looks good. >> >> Thanks, >> David >> >> On 16/10/2017 8:00 AM, Roman Kennke wrote: >>> >>> Ok, I fixed all the comments you mentioned. >>> >>> Differential (against webrev.01): >>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03.diff/ >>> >>> Full webrev: >>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03/ >>> >>> >>> Roman >>> >>>> Just spotted this: >>>> >>>> ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** >>>> {@code CompLevel::CompLevel_full_optimization} -- C2 or Shark */ >>>> >>>> David >>>> >>>> On 16/10/2017 7:25 AM, David Holmes wrote: >>>>> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>>>>> Hi David, >>>>>> >>>>>> thanks! >>>>>> >>>>>> I'm uploading a 2nd revision of the patch that excludes the >>>>>> generated-configure.sh part, and adds a smallish Zero-related fix. >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>>>>> >>>>> >>>>> Can you point me to the exact change please as I don't want to >>>>> re-examine it all. :) >>>>> >>>>> I'll pull this in and do a test build run internally. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks, Roman >>>>>> >>>>>> >>>>>>> Hi Roman, >>>>>>> >>>>>>> The build changes must be reviewed on build-dev - now cc'd. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>>>>>> The JEP to remove the Shark compiler has received exclusively >>>>>>>> positive feedback (JDK-8189173) on zero-dev. So here comes the >>>>>>>> big patch to remove it. >>>>>>>> >>>>>>>> What I have done: >>>>>>>> >>>>>>>> grep -i -R shark src >>>>>>>> grep -i -R shark make >>>>>>>> grep -i -R shark doc >>>>>>>> grep -i -R shark doc >>>>>>>> >>>>>>>> and purged any reference to shark. Almost everything was >>>>>>>> straightforward. >>>>>>>> >>>>>>>> The only things I wasn't really sure of: >>>>>>>> >>>>>>>> - in globals.hpp, I re-arranged the KIND_* bits to account for >>>>>>>> the gap that removing KIND_SHARK left. I hope that's good? >>>>>>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>>>>>>> pd_address_in_code(), I am not sure it is the right thing to >>>>>>>> do. If not, what *would* be the right thing? 
>>>>>>>> >>>>>>>> Then of course I did: >>>>>>>> >>>>>>>> rm -rf src/hotspot/share/shark >>>>>>>> >>>>>>>> I also went through the build machinery and removed stuff >>>>>>>> related to Shark and LLVM libs. >>>>>>>> >>>>>>>> Now the only references in the whole JDK tree to shark is a >>>>>>>> 'Shark Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>>>>>>> >>>>>>>> I tested by building a regular x86 JVM and running JTREG tests. >>>>>>>> All looks fine. >>>>>>>> >>>>>>>> - I could not build zero because it seems broken because of the >>>>>>>> recent Atomic::* changes >>>>>>>> - I could not test any of the other arches that seemed to >>>>>>>> reference Shark (arm and sparc) >>>>>>>> >>>>>>>> Here's the full webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>>>>>> >>>>>>>> >>>>>>>> Can I get a review on this? >>>>>>>> >>>>>>>> Thanks, Roman >>>>>>>> >>>>>> >>> From david.holmes at oracle.com Mon Oct 16 06:10:19 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Oct 2017 16:10:19 +1000 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> <321ac223-1dc8-841c-93c0-c39c770b7e20@oracle.com> <331579a0-29de-f152-2dd4-66987896c463@oracle.com> Message-ID: <456436e4-955c-75f5-ac92-e2fd4a2fb280@oracle.com> On 16/10/2017 3:49 PM, Roman Kennke wrote: > > Hi David, > > thanks for reviewing and testing! > > The interaction between JEPs and patches going in is not really clear to > me, nor is it well documented. For example, we're already pushing > patches for JEP 304: Garbage Collection Interface, even though it's only > in 'candidate' state... If patches can be separated out into generally useful cleanup or enabling changes then it can be okay to push them independently of the JEP AFAIK. That's obviously a little subjective. In this case though we're talking about the whole thing at once, so AFAIK the JEP has to be targeted before the changes can be pushed. > In any case, I'll ping Mark Reinhold about moving the Shark JEP forward. Thanks. Should be simple enough, I hope. :) Cheers, David > Thanks again, > Roman > >> My internal JPRT run went fine. So this just needs a build team >> signoff from the perspective of the patch. >> >> However, as this has had a JEP submitted for it, the code changes can >> not be pushed until the JEP has been targeted. >> >> Thanks, >> David >> >> On 16/10/2017 8:08 AM, David Holmes wrote: >>> Looks good. >>> >>> Thanks, >>> David >>> >>> On 16/10/2017 8:00 AM, Roman Kennke wrote: >>>> >>>> Ok, I fixed all the comments you mentioned. >>>> >>>> Differential (against webrev.01): >>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03.diff/ >>>> >>>> Full webrev: >>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03/ >>>> >>>> >>>> Roman >>>> >>>>> Just spotted this: >>>>> >>>>> ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** >>>>> {@code CompLevel::CompLevel_full_optimization} -- C2 or Shark */ >>>>> >>>>> David >>>>> >>>>> On 16/10/2017 7:25 AM, David Holmes wrote: >>>>>> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> thanks! >>>>>>> >>>>>>> I'm uploading a 2nd revision of the patch that excludes the >>>>>>> generated-configure.sh part, and adds a smallish Zero-related fix. 
>>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>>>>>> >>>>>> >>>>>> Can you point me to the exact change please as I don't want to >>>>>> re-examine it all. :) >>>>>> >>>>>> I'll pull this in and do a test build run internally. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Thanks, Roman >>>>>>> >>>>>>> >>>>>>>> Hi Roman, >>>>>>>> >>>>>>>> The build changes must be reviewed on build-dev - now cc'd. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>>>>>>> The JEP to remove the Shark compiler has received exclusively >>>>>>>>> positive feedback (JDK-8189173) on zero-dev. So here comes the >>>>>>>>> big patch to remove it. >>>>>>>>> >>>>>>>>> What I have done: >>>>>>>>> >>>>>>>>> grep -i -R shark src >>>>>>>>> grep -i -R shark make >>>>>>>>> grep -i -R shark doc >>>>>>>>> grep -i -R shark doc >>>>>>>>> >>>>>>>>> and purged any reference to shark. Almost everything was >>>>>>>>> straightforward. >>>>>>>>> >>>>>>>>> The only things I wasn't really sure of: >>>>>>>>> >>>>>>>>> - in globals.hpp, I re-arranged the KIND_* bits to account for >>>>>>>>> the gap that removing KIND_SHARK left. I hope that's good? >>>>>>>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>>>>>>>> pd_address_in_code(), I am not sure it is the right thing to >>>>>>>>> do. If not, what *would* be the right thing? >>>>>>>>> >>>>>>>>> Then of course I did: >>>>>>>>> >>>>>>>>> rm -rf src/hotspot/share/shark >>>>>>>>> >>>>>>>>> I also went through the build machinery and removed stuff >>>>>>>>> related to Shark and LLVM libs. >>>>>>>>> >>>>>>>>> Now the only references in the whole JDK tree to shark is a >>>>>>>>> 'Shark Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>>>>>>>> >>>>>>>>> I tested by building a regular x86 JVM and running JTREG tests. >>>>>>>>> All looks fine. >>>>>>>>> >>>>>>>>> - I could not build zero because it seems broken because of the >>>>>>>>> recent Atomic::* changes >>>>>>>>> - I could not test any of the other arches that seemed to >>>>>>>>> reference Shark (arm and sparc) >>>>>>>>> >>>>>>>>> Here's the full webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>>>>>>> >>>>>>>>> >>>>>>>>> Can I get a review on this? >>>>>>>>> >>>>>>>>> Thanks, Roman >>>>>>>>> >>>>>>> >>>> > From aph at redhat.com Mon Oct 16 07:31:50 2017 From: aph at redhat.com (Andrew Haley) Date: Mon, 16 Oct 2017 08:31:50 +0100 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> Message-ID: <75f7713d-e1d6-46a8-820d-f4d76f1d722b@redhat.com> On 15/10/17 21:26, John Paul Adrian Glaubitz wrote: > On 10/15/2017 12:41 AM, Roman Kennke wrote: >> The JEP to remove the Shark compiler has received exclusively positive >> feedback (JDK-8189173) on zero-dev. So here comes the big patch to remove it. > > I have now read through the JEP and I have to say, I'm sad to see > Shark go. > > In my opinion, Shark should be a supported version of the JVM as > LLVM is gaining code generation support for more and more > architectures. I have always liked the idea to split out the code > generation of compilers into a separate project and, in fact, the > compilers for many other languages like Rust and Julia rely on LLVM. There's no reason that something like Shark couldn't be written again, but the problem at the time was that LLVM was a work in flux, and its interface to the JIT continually mutated. 
In addition, each LLVM version had bugs which broke HotSpot; these bugs would be fixed in the next version, but the next version had more bugs which broke HotSpot. It was impossible to keep it working. > It's a pity that this value is not seen within the OpenJDK project. It's seen, for sure. Otherwise I wouldn't have wanted us to do it. There's no reason something like Shark couldn't be done again, but you wouldn't start from here. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Mon Oct 16 07:33:37 2017 From: aph at redhat.com (Andrew Haley) Date: Mon, 16 Oct 2017 08:33:37 +0100 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <152a7a54-d30f-3c82-313a-608ef118628a@redhat.com> Message-ID: <0ff1b913-15d8-3fce-0381-c16076a8b0b5@redhat.com> On 15/10/17 21:44, John Paul Adrian Glaubitz wrote: > FWIW, there are actually quite a number of users for Zero who would be happy to > have a JIT-version of it. One major user for Zero is MIPS (big-, little-endian, > 32 and 64 bit) which still doesn't have a native code generator in Hotspot. The problem with LLVM was always that its JIT interface didn't have support for unpopular targets, thus negating its usefulness. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From erik.joelsson at oracle.com Mon Oct 16 08:24:56 2017 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 16 Oct 2017 10:24:56 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> Message-ID: <872910c6-a17b-d3df-bc80-fa850b9738d9@oracle.com> Hello Roman, In hotspot.m4, I believe the check on line 328 (pre changes) is still relevant for just the zero case. Otherwise build changes look good to me. /Erik On 2017-10-16 00:00, Roman Kennke wrote: > > Ok, I fixed all the comments you mentioned. > > Differential (against webrev.01): > http://cr.openjdk.java.net/~rkennke/8171853/webrev.03.diff/ > > Full webrev: > http://cr.openjdk.java.net/~rkennke/8171853/webrev.03/ > > > Roman > >> Just spotted this: >> >> ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** >> {@code CompLevel::CompLevel_full_optimization} -- C2 or Shark */ >> >> David >> >> On 16/10/2017 7:25 AM, David Holmes wrote: >>> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>>> Hi David, >>>> >>>> thanks! >>>> >>>> I'm uploading a 2nd revision of the patch that excludes the >>>> generated-configure.sh part, and adds a smallish Zero-related fix. >>>> >>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>>> >>> >>> Can you point me to the exact change please as I don't want to >>> re-examine it all. :) >>> >>> I'll pull this in and do a test build run internally. >>> >>> Thanks, >>> David >>> >>>> Thanks, Roman >>>> >>>> >>>>> Hi Roman, >>>>> >>>>> The build changes must be reviewed on build-dev - now cc'd. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>>>> The JEP to remove the Shark compiler has received exclusively >>>>>> positive feedback (JDK-8189173) on zero-dev. So here comes the >>>>>> big patch to remove it. 
>>>>>> >>>>>> What I have done: >>>>>> >>>>>> grep -i -R shark src >>>>>> grep -i -R shark make >>>>>> grep -i -R shark doc >>>>>> grep -i -R shark doc >>>>>> >>>>>> and purged any reference to shark. Almost everything was >>>>>> straightforward. >>>>>> >>>>>> The only things I wasn't really sure of: >>>>>> >>>>>> - in globals.hpp, I re-arranged the KIND_* bits to account for >>>>>> the gap that removing KIND_SHARK left. I hope that's good? >>>>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>>>>> pd_address_in_code(), I am not sure it is the right thing to do. >>>>>> If not, what *would* be the right thing? >>>>>> >>>>>> Then of course I did: >>>>>> >>>>>> rm -rf src/hotspot/share/shark >>>>>> >>>>>> I also went through the build machinery and removed stuff related >>>>>> to Shark and LLVM libs. >>>>>> >>>>>> Now the only references in the whole JDK tree to shark is a >>>>>> 'Shark Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>>>>> >>>>>> I tested by building a regular x86 JVM and running JTREG tests. >>>>>> All looks fine. >>>>>> >>>>>> - I could not build zero because it seems broken because of the >>>>>> recent Atomic::* changes >>>>>> - I could not test any of the other arches that seemed to >>>>>> reference Shark (arm and sparc) >>>>>> >>>>>> Here's the full webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>>>> >>>>>> >>>>>> Can I get a review on this? >>>>>> >>>>>> Thanks, Roman >>>>>> >>>> > From magnus.ihse.bursie at oracle.com Mon Oct 16 09:25:59 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Mon, 16 Oct 2017 11:25:59 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <872910c6-a17b-d3df-bc80-fa850b9738d9@oracle.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> <872910c6-a17b-d3df-bc80-fa850b9738d9@oracle.com> Message-ID: On 2017-10-16 10:24, Erik Joelsson wrote: > Hello Roman, > > In hotspot.m4, I believe the check on line 328 (pre changes) is still > relevant for just the zero case. Yes, it is indeed. > > Otherwise build changes look good to me. Agree, looks good. /Magnus > > /Erik > > > On 2017-10-16 00:00, Roman Kennke wrote: >> >> Ok, I fixed all the comments you mentioned. >> >> Differential (against webrev.01): >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03.diff/ >> >> Full webrev: >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03/ >> >> >> Roman >> >>> Just spotted this: >>> >>> ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** >>> {@code CompLevel::CompLevel_full_optimization} -- C2 or Shark */ >>> >>> David >>> >>> On 16/10/2017 7:25 AM, David Holmes wrote: >>>> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>>>> Hi David, >>>>> >>>>> thanks! >>>>> >>>>> I'm uploading a 2nd revision of the patch that excludes the >>>>> generated-configure.sh part, and adds a smallish Zero-related fix. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>>>> >>>> >>>> Can you point me to the exact change please as I don't want to >>>> re-examine it all. :) >>>> >>>> I'll pull this in and do a test build run internally. >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, Roman >>>>> >>>>> >>>>>> Hi Roman, >>>>>> >>>>>> The build changes must be reviewed on build-dev - now cc'd. 
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>>>>> The JEP to remove the Shark compiler has received exclusively >>>>>>> positive feedback (JDK-8189173) on zero-dev. So here comes the >>>>>>> big patch to remove it. >>>>>>> >>>>>>> What I have done: >>>>>>> >>>>>>> grep -i -R shark src >>>>>>> grep -i -R shark make >>>>>>> grep -i -R shark doc >>>>>>> grep -i -R shark doc >>>>>>> >>>>>>> and purged any reference to shark. Almost everything was >>>>>>> straightforward. >>>>>>> >>>>>>> The only things I wasn't really sure of: >>>>>>> >>>>>>> - in globals.hpp, I re-arranged the KIND_* bits to account for >>>>>>> the gap that removing KIND_SHARK left. I hope that's good? >>>>>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>>>>>> pd_address_in_code(), I am not sure it is the right thing to do. >>>>>>> If not, what *would* be the right thing? >>>>>>> >>>>>>> Then of course I did: >>>>>>> >>>>>>> rm -rf src/hotspot/share/shark >>>>>>> >>>>>>> I also went through the build machinery and removed stuff >>>>>>> related to Shark and LLVM libs. >>>>>>> >>>>>>> Now the only references in the whole JDK tree to shark is a >>>>>>> 'Shark Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>>>>>> >>>>>>> I tested by building a regular x86 JVM and running JTREG tests. >>>>>>> All looks fine. >>>>>>> >>>>>>> - I could not build zero because it seems broken because of the >>>>>>> recent Atomic::* changes >>>>>>> - I could not test any of the other arches that seemed to >>>>>>> reference Shark (arm and sparc) >>>>>>> >>>>>>> Here's the full webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>>>>> >>>>>>> >>>>>>> Can I get a review on this? >>>>>>> >>>>>>> Thanks, Roman >>>>>>> >>>>> >> > From rkennke at redhat.com Mon Oct 16 10:26:43 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 16 Oct 2017 12:26:43 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <872910c6-a17b-d3df-bc80-fa850b9738d9@oracle.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> <872910c6-a17b-d3df-bc80-fa850b9738d9@oracle.com> Message-ID: <6ed7e856-baee-a59d-6710-4ff143277dc9@redhat.com> Hi Erik, You mean like this? http://cr.openjdk.java.net/~rkennke/8171853/webrev.04.diff/ Full webrev here: http://cr.openjdk.java.net/~rkennke/8171853/webrev.04/ Thanks, Roman > Hello Roman, > > In hotspot.m4, I believe the check on line 328 (pre changes) is still > relevant for just the zero case. > > Otherwise build changes look good to me. > > /Erik > > > On 2017-10-16 00:00, Roman Kennke wrote: >> >> Ok, I fixed all the comments you mentioned. >> >> Differential (against webrev.01): >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03.diff/ >> >> Full webrev: >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03/ >> >> >> Roman >> >>> Just spotted this: >>> >>> ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** >>> {@code CompLevel::CompLevel_full_optimization} -- C2 or Shark */ >>> >>> David >>> >>> On 16/10/2017 7:25 AM, David Holmes wrote: >>>> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>>>> Hi David, >>>>> >>>>> thanks! >>>>> >>>>> I'm uploading a 2nd revision of the patch that excludes the >>>>> generated-configure.sh part, and adds a smallish Zero-related fix. 
>>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>>>> >>>> >>>> Can you point me to the exact change please as I don't want to >>>> re-examine it all. :) >>>> >>>> I'll pull this in and do a test build run internally. >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, Roman >>>>> >>>>> >>>>>> Hi Roman, >>>>>> >>>>>> The build changes must be reviewed on build-dev - now cc'd. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>>>>> The JEP to remove the Shark compiler has received exclusively >>>>>>> positive feedback (JDK-8189173) on zero-dev. So here comes the >>>>>>> big patch to remove it. >>>>>>> >>>>>>> What I have done: >>>>>>> >>>>>>> grep -i -R shark src >>>>>>> grep -i -R shark make >>>>>>> grep -i -R shark doc >>>>>>> grep -i -R shark doc >>>>>>> >>>>>>> and purged any reference to shark. Almost everything was >>>>>>> straightforward. >>>>>>> >>>>>>> The only things I wasn't really sure of: >>>>>>> >>>>>>> - in globals.hpp, I re-arranged the KIND_* bits to account for >>>>>>> the gap that removing KIND_SHARK left. I hope that's good? >>>>>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>>>>>> pd_address_in_code(), I am not sure it is the right thing to do. >>>>>>> If not, what *would* be the right thing? >>>>>>> >>>>>>> Then of course I did: >>>>>>> >>>>>>> rm -rf src/hotspot/share/shark >>>>>>> >>>>>>> I also went through the build machinery and removed stuff >>>>>>> related to Shark and LLVM libs. >>>>>>> >>>>>>> Now the only references in the whole JDK tree to shark is a >>>>>>> 'Shark Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>>>>>> >>>>>>> I tested by building a regular x86 JVM and running JTREG tests. >>>>>>> All looks fine. >>>>>>> >>>>>>> - I could not build zero because it seems broken because of the >>>>>>> recent Atomic::* changes >>>>>>> - I could not test any of the other arches that seemed to >>>>>>> reference Shark (arm and sparc) >>>>>>> >>>>>>> Here's the full webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>>>>> >>>>>>> >>>>>>> Can I get a review on this? >>>>>>> >>>>>>> Thanks, Roman >>>>>>> >>>>> >> > From erik.joelsson at oracle.com Mon Oct 16 10:55:28 2017 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 16 Oct 2017 12:55:28 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <6ed7e856-baee-a59d-6710-4ff143277dc9@redhat.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <836cd06d-d10a-a98e-d996-b5b92de94c4b@oracle.com> <980aa9ae-9c50-c1dc-d52f-e00234a2e3ca@redhat.com> <872910c6-a17b-d3df-bc80-fa850b9738d9@oracle.com> <6ed7e856-baee-a59d-6710-4ff143277dc9@redhat.com> Message-ID: That looks correct. Thanks! /Erik On 2017-10-16 12:26, Roman Kennke wrote: > > Hi Erik, > > You mean like this? > > http://cr.openjdk.java.net/~rkennke/8171853/webrev.04.diff/ > > > Full webrev here: > http://cr.openjdk.java.net/~rkennke/8171853/webrev.04/ > > > Thanks, > Roman > >> Hello Roman, >> >> In hotspot.m4, I believe the check on line 328 (pre changes) is still >> relevant for just the zero case. >> >> Otherwise build changes look good to me. >> >> /Erik >> >> >> On 2017-10-16 00:00, Roman Kennke wrote: >>> >>> Ok, I fixed all the comments you mentioned. 
>>> >>> Differential (against webrev.01): >>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03.diff/ >>> >>> Full webrev: >>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.03/ >>> >>> >>> Roman >>> >>>> Just spotted this: >>>> >>>> ./hotspot/jtreg/compiler/whitebox/CompilerWhiteBoxTest.java: /** >>>> {@code CompLevel::CompLevel_full_optimization} -- C2 or Shark */ >>>> >>>> David >>>> >>>> On 16/10/2017 7:25 AM, David Holmes wrote: >>>>> On 16/10/2017 7:01 AM, Roman Kennke wrote: >>>>>> Hi David, >>>>>> >>>>>> thanks! >>>>>> >>>>>> I'm uploading a 2nd revision of the patch that excludes the >>>>>> generated-configure.sh part, and adds a smallish Zero-related fix. >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.01/ >>>>>> >>>>> >>>>> Can you point me to the exact change please as I don't want to >>>>> re-examine it all. :) >>>>> >>>>> I'll pull this in and do a test build run internally. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks, Roman >>>>>> >>>>>> >>>>>>> Hi Roman, >>>>>>> >>>>>>> The build changes must be reviewed on build-dev - now cc'd. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> On 15/10/2017 8:41 AM, Roman Kennke wrote: >>>>>>>> The JEP to remove the Shark compiler has received exclusively >>>>>>>> positive feedback (JDK-8189173) on zero-dev. So here comes the >>>>>>>> big patch to remove it. >>>>>>>> >>>>>>>> What I have done: >>>>>>>> >>>>>>>> grep -i -R shark src >>>>>>>> grep -i -R shark make >>>>>>>> grep -i -R shark doc >>>>>>>> grep -i -R shark doc >>>>>>>> >>>>>>>> and purged any reference to shark. Almost everything was >>>>>>>> straightforward. >>>>>>>> >>>>>>>> The only things I wasn't really sure of: >>>>>>>> >>>>>>>> - in globals.hpp, I re-arranged the KIND_* bits to account for >>>>>>>> the gap that removing KIND_SHARK left. I hope that's good? >>>>>>>> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in >>>>>>>> pd_address_in_code(), I am not sure it is the right thing to >>>>>>>> do. If not, what *would* be the right thing? >>>>>>>> >>>>>>>> Then of course I did: >>>>>>>> >>>>>>>> rm -rf src/hotspot/share/shark >>>>>>>> >>>>>>>> I also went through the build machinery and removed stuff >>>>>>>> related to Shark and LLVM libs. >>>>>>>> >>>>>>>> Now the only references in the whole JDK tree to shark is a >>>>>>>> 'Shark Bay' in a timezone file, and 'Wireshark' in some tests ;-) >>>>>>>> >>>>>>>> I tested by building a regular x86 JVM and running JTREG tests. >>>>>>>> All looks fine. >>>>>>>> >>>>>>>> - I could not build zero because it seems broken because of the >>>>>>>> recent Atomic::* changes >>>>>>>> - I could not test any of the other arches that seemed to >>>>>>>> reference Shark (arm and sparc) >>>>>>>> >>>>>>>> Here's the full webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >>>>>>>> >>>>>>>> >>>>>>>> Can I get a review on this? >>>>>>>> >>>>>>>> Thanks, Roman >>>>>>>> >>>>>> >>> >> > From coleen.phillimore at oracle.com Mon Oct 16 13:10:47 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 16 Oct 2017 09:10:47 -0400 Subject: RFR: 8189333: Fix Zero build after Atomic::xchg changes In-Reply-To: References: <003ff7d9-759f-1ef5-f580-18c2571b63e5@redhat.com> Message-ID: <441ed55f-6398-9fa1-d571-86548ed5a2a9@oracle.com> Hi Roman, Can you build zero with this changeset? http://cr.openjdk.java.net/~coleenp/8188220.03/webrev/index.html My scripts for building zero are broken now. 
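For anyone trying to reproduce the Zero build being asked about here, one common recipe is to select the zero JVM variant at configure time; the options below are a sketch and may need adjusting for a given JDK version and platform:

    # Interpreter-only (Zero) fastdebug build of the JDK
    bash configure --with-jvm-variants=zero --with-debug-level=fastdebug
    make images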
thanks, Coleen On 10/15/17 5:40 PM, Roman Kennke wrote: > Am 15.10.2017 um 23:32 schrieb David Holmes: >> Hi Roman, >> >> On 16/10/2017 7:12 AM, Roman Kennke wrote: >>> Zero debug build has been broken by: JDK-8187977: Generalize >>> Atomic::xchg to use templates. >>> >>> This patch fixes it by casting the unsigned literal to jint: >>> >>> http://cr.openjdk.java.net/~rkennke/8189333/webrev.00/ >>> >> >> Looks fine. >> >> I can push this for you straight away (relatively speaking :) ) under >> the trivial rule. > Thanks! > > Roman From coleen.phillimore at oracle.com Mon Oct 16 13:13:52 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 16 Oct 2017 09:13:52 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <0784FA88-3D00-4DBA-8726-3A3B23C91B3E@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> <0784FA88-3D00-4DBA-8726-3A3B23C91B3E@oracle.com> Message-ID: <2f32124d-2428-678d-ef50-3306231aa848@oracle.com> On 10/14/17 7:36 PM, Kim Barrett wrote: >> On Oct 13, 2017, at 2:34 PM, coleen.phillimore at oracle.com wrote: >> >> >> Hi, Here is the version with the changes from Kim's comments that has passed at least testing with JPRT and tier1, locally. More testing (tier2-5) is in progress. >> >> Also includes a corrected version of Atomic::sub care of Erik Osterlund. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.kim-review-changes/webrev >> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev >> >> Full version: >> >> http://cr.openjdk.java.net/~coleenp/8188220.03/webrev >> >> Thanks! >> Coleen > I still dislike and disagree with what is being proposed regarding replace_if_null. We can discuss that seperately, please file an RFE. > > ------------------------------------------------------------------------------ > I forgot that I'd promised you an updated Atomic::sub definition. > Unfortunately, the new one still has problems, performing some > conversions that should not be permitted (and are disallowed by > Atomic::add). Try this instead. (This hasn't been tested, not even > compiled; hopefully I don't have any typos or anything.) The intent > is that this supports the same conversions as Atomic::add. > > template > inline D Atomic::sub(I sub_value, D volatile* dest) { > STATIC_ASSERT(IsPointer::value || IsIntegral::value); > STATIC_ASSERT(IsIntegral::value); > // If D is a pointer type, use [u]intptr_t as the addend type, > // matching signedness of I. Otherwise, use D as the addend type. > typedef typename Conditional::value, intptr_t, uintptr_t>::type PI; > typedef typename Conditional::value, PI, D>::type AddendType; > // Only allow conversions that can't change the value. > STATIC_ASSERT(IsSigned::value == IsSigned::value); > STATIC_ASSERT(sizeof(I) <= sizeof(AddendType)); > AddendType addend = sub_value; > // Assumes two's complement integer representation. > #pragma warning(suppress: 4146) // In case AddendType is not signed. > return Atomic::add(-addend, dest); > } Uh, Ok.? I'll try it out. > >>>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>>> 7960 Atomic::add(-n, &_num_par_pushes); >>>> >>>> Atomic::sub >>> fixed. > Nope, not fixed in http://cr.openjdk.java.net/~coleenp/8188220.03/webrev Missed it twice now.? I think I have it now. 
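The STATIC_ASSERTs and the AddendType widening in the definition quoted above guard against the narrowing pitfall raised earlier in the thread. A standalone illustration with plain integers (no atomics, invented variable names): negating an unsigned value at its own narrower width and then widening it is not the same as subtracting it at the destination's width.

    #include <cassert>
    #include <cstdint>

    int main() {
      uint32_t n = 1;            // narrower unsigned "sub_value"
      uint64_t counter = 100;    // wider unsigned destination

      // What a naive add(-n, &counter) computes: -n wraps to 0xFFFFFFFF as a
      // 32-bit value and is then zero-extended to 64 bits.
      uint64_t naive_addend = static_cast<uint64_t>(-n);
      assert(counter + naive_addend == 100 + 0xFFFFFFFFull);   // not 99

      // Converting to the destination width first and negating there gives
      // the intended result through ordinary modular arithmetic.
      uint64_t addend = n;
      assert(counter + (0 - addend) == 99);
      return 0;
    }

The quoted Atomic::sub avoids the trap by converting sub_value to an addend of the destination's width before negating, and by statically rejecting combinations whose size or signedness could change the value.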
>>>> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >>>> 200 PerRegionTable* res = >>>> 201 Atomic::cmpxchg(nxt, &_free_list, fl); >>>> >>>> Please remove the line break, now that the code has been simplified. >>>> >>>> But wait, doesn't this alloc exhibit classic ABA problems? I *think* >>>> this works because alloc and bulk_free are called in different phases, >>>> never overlapping. >>> I don't know. Do you want to file a bug to investigate this? >>> fixed. > No, I now think it?s ok, though confusing. > >>>> src/hotspot/share/gc/g1/sparsePRT.cpp >>>> 295 SparsePRT* res = >>>> 296 Atomic::cmpxchg(sprt, &_head_expanded_list, hd); >>>> and >>>> 307 SparsePRT* res = >>>> 308 Atomic::cmpxchg(next, &_head_expanded_list, hd); >>>> >>>> I'd rather not have the line breaks in these either. >>>> >>>> And get_from_expanded_list also appears to have classic ABA problems. >>>> I *think* this works because add_to_expanded_list and >>>> get_from_expanded_list are called in different phases, never >>>> overlapping. >>> Fixed, same question as above? Or one bug to investigate both? > Again, I think it?s ok, though confusing. > >>>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>>> 262 return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>>> 263 (volatile intptr_t *)&_data, >>>> 264 (intptr_t)old_age._data); >>>> >>>> This should be >>>> >>>> return Atomic::cmpxchg(new_age._data, &_data, old_age._data); >>> fixed. > Still casting the result. I thought I fixed it.? I think I fixed it now. > >>>> src/hotspot/share/oops/method.hpp >>>> 139 volatile address from_compiled_entry() const { return OrderAccess::load_acquire(&_from_compiled_entry); } >>>> 140 volatile address from_compiled_entry_no_trampoline() const; >>>> 141 volatile address from_interpreted_entry() const{ return OrderAccess::load_acquire(&_from_interpreted_entry); } >>>> >>>> [pre-existing] >>>> The volatile qualifiers here seem suspect to me. >>> Again much suspicion about concurrency and giant pain, which I remember, of debugging these when they were broken. > Let me be more direct: the volatile qualifiers for the function return > types are bogus and confusing, and should be removed. Okay, sure. > >>>> src/hotspot/share/prims/jni.cpp >>>> >>>> [pre-existing] >>>> >>>> copy_jni_function_table should be using Copy::disjoint_words_atomic. >>> yuck. > Of course, neither is entirely technically correct, since both are > treating conversion of function pointers to void* as okay in shared > code, e.g. violating some of the raison d'etre of CAST_{TO,FROM}_FN_PTR. > For way more detail than you probably care about, see the discussion > starting here: > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-March/018578.html > through (5 messages in total) > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-March/018623.html > > Oh well. > >>>> src/hotspot/share/runtime/mutex.hpp >>>> >>>> [pre-existing] >>>> >>>> I think the Address member of the SplitWord union is unused. Looking >>>> at AcquireOrPush (and others), I'm wondering whether it *should* be >>>> used there, or whether just using intptr_t casts and doing integral >>>> arithmetic (as is presently being done) is easier and clearer. >>>> >>>> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp >>>> rather than polluting the global namespace. And technically, that >>>> name is reserved word. >>> I moved both this and _LBIT into the top of mutex.cpp since they are used there. > Good. 
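On the question of where a constant like _LBIT can live with a pre-C++11 toolchain, a hedged sketch of the alternatives (illustrative names only):

    #include <stdint.h>

    // 1. A file-scope constant in the .cpp file -- portable to old compilers,
    //    and what the change described above does.
    static const intptr_t LBIT = 1;

    // 2. A static integral class constant with an in-class initializer --
    //    also fine pre-C++11, but not an option for SplitWord because a
    //    union cannot have static data members.
    struct LockBits {
      static const intptr_t BIT = 1;
    };

    // 3. A non-static member with an initializer is the C++11 feature
    //    (a non-static data member initializer) that an older -std mode
    //    rejects:
    //      struct SplitWord { const intptr_t _LBIT = 1; };  // needs C++11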
> >>> Cant define const intptr_t _LBIT =1; in a class in our version of C++. > Sorry, please explain? If you tried to move it into SplitWord, that doesn?t work; > unions are not permitted to have static data members (I don?t off-hand know why, > just that it?s explicitly forbidden). > > And you left the seemingly unused Address member in SplitWord. This is the compilation error I get: /scratch/cphillim/hg/10ptr2/open/src/hotspot/share/runtime/mutex.hpp:124:33: error: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [-Werror] ?? const intptr_t _NEW_LOCKBIT = 1; I don't own this SplitWord code so do not want to remove the unused Address member. > >>>> src/hotspot/share/runtime/thread.cpp >>>> 4707 intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, (intptr_t)0); >>>> >>>> This and other places suggest LOCKBIT should be defined as intptr_t, >>>> rather than as an enum value. The MuxBits enum type is unused. >>>> >>>> And the cast of 0 is another case where implicit widening would be nice. >>> Making LOCKBIT a const intptr_t = 1 removes a lot of casts. > Because of the new definition of LOCKBIT I noticed the immediately > preceeding typedef for MutexT, which seems to be unused. Removed MutexT. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/cpCache.cpp > 114 bool ConstantPoolCacheEntry::init_flags_atomic(intx flags) { > 115 intptr_t result = Atomic::cmpxchg(flags, &_flags, (intx)0); > 116 return (result == 0); > 117 } > > [I missed this on earlier pass.] > > Should be > > bool ConstantPoolCacheEntry::init_flags_atomic(intx flags) { > return Atomic::cmpxchg(flags, &_flags, (intx)0) == 0; > } > > Otherwise, I end up asking why result is intptr_t when the cmpxchg is > dealing with intx. Yeah, one's a typedef of the other, but mixing > them like that in the same expression is not helpful. > > Sure why not? Actually init_flags_atomic is not used and neither is init_method_flags_atomic so I did one better and removed them. Thanks for the again thorough code review and Atomic::sub.?? I'll post incremental when it compiles. Coleen From coleen.phillimore at oracle.com Mon Oct 16 13:27:24 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 16 Oct 2017 09:27:24 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <33af17b9-6dce-5a5e-cb94-b3c1afbe8532@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> <7265c30d-946b-19c4-a1b3-c3314a869ee8@oracle.com> <33af17b9-6dce-5a5e-cb94-b3c1afbe8532@oracle.com> Message-ID: <97afc964-b6e4-9937-94d0-06aa181919a2@oracle.com> On 10/15/17 9:18 PM, David Holmes wrote: > One tiny follow up as I was looking at this code: > > src/hotspot/share/services/mallocSiteTable.hpp > > 65?? MallocSiteHashtableEntry* _next; > > should be > > 65?? MallocSiteHashtableEntry* volatile _next; > > as we operate on it with CAS. Ok, got it. thanks. Coleen > > Thanks, > David > > On 14/10/2017 10:32 PM, David Holmes wrote: >> Hi Coleen, >> >> These changes all seem okay to me - except I can't comment on the >> Atomic::sub implementation. :) >> >> Thanks for adding the assert to header_addr(). FYI from >> objectMonitor.hpp: >> >> // ObjectMonitor Layout Overview/Highlights/Restrictions: >> // >> // - The _header field must be at offset 0 because the displaced header >> //?? from markOop is stored there. 
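A minimal standalone illustration of the distinction David is drawing (invented type name, not the HotSpot declaration): the position of volatile relative to the * decides whether the pointee or the pointer itself is volatile, and a field that is the target of a CAS needs the latter.

    struct Entry {
      volatile Entry* a;   // pointer to volatile Entry: the pointee is volatile
      Entry* volatile b;   // volatile pointer to Entry: the pointer field
                           // itself is volatile, which is what a CAS on &b
                           // actually wants
    };

The same reading is behind the earlier cpCache fix, where 'volatile Metadata* _f1' was really meant to be 'Metadata* volatile _f1'.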
We do not want markOop.hpp to include >> //?? ObjectMonitor.hpp to avoid exposing ObjectMonitor everywhere. This >> //?? means that ObjectMonitor cannot inherit from any other class nor >> can >> //?? it use any virtual member functions. This restriction is >> critical to >> //?? the proper functioning of the VM. >> >> so it is important we ensure this holds. >> >> Thanks, >> David >> >> On 14/10/2017 4:34 AM, coleen.phillimore at oracle.com wrote: >>> >>> Hi, Here is the version with the changes from Kim's comments that >>> has passed at least testing with JPRT and tier1, locally.?? More >>> testing (tier2-5) is in progress. >>> >>> Also includes a corrected version of Atomic::sub care of Erik >>> Osterlund. >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/8188220.kim-review-changes/webrev >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev >>> >>> Full version: >>> >>> http://cr.openjdk.java.net/~coleenp/8188220.03/webrev >>> >>> Thanks! >>> Coleen >>> >>> On 10/13/17 9:25 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> Hi Kim, Thank you for the detailed review and the time you've spent >>>> on it, and discussion yesterday. >>>> >>>> On 10/12/17 7:17 PM, Kim Barrett wrote: >>>>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> Summary: With the new template functions these are unnecessary. >>>>>> >>>>>> The changes are mostly s/_ptr// and removing the cast to return >>>>>> type.? There weren't many types that needed to be improved to >>>>>> match the template version of the function.?? Some notes: >>>>>> 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, >>>>>> rearranging arguments. >>>>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.? I >>>>>> disliked the first name because it's not explicit from the >>>>>> callers that there's an underlying cas.? If people want to fight, >>>>>> I'll remove the function and use cmpxchg because there are only a >>>>>> couple places where this is a little nicer. >>>>>> 3. Added Atomic::sub() >>>>>> >>>>>> Tested with JPRT, mach5 tier1-5 on linux,windows and solaris. >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8188220 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>> I looked harder at the potential ABA problems, and believe they are >>>>> okay.? There can be multiple threads doing pushes, and there can be >>>>> multiple threads doing pops, but not both at the same time. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/cpu/zero/cppInterpreter_zero.cpp >>>>> ? 279???? if (Atomic::cmpxchg(monitor, lockee->mark_addr(), disp) >>>>> != disp) { >>>>> >>>>> How does this work?? monitor and disp seem like they have unrelated >>>>> types?? Given that this is zero-specific code, maybe this hasn't been >>>>> tested? >>>>> >>>>> Similarly here: >>>>> ? 423?????? if (Atomic::cmpxchg(header, rcvr->mark_addr(), lock) >>>>> != lock) { >>>> >>>> I haven't built zero.? I don't know how to do this anymore (help?) >>>> I fixed the obvious type mismatches here and in >>>> bytecodeInterpreter.cpp.? I'll try to build it. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/asm/assembler.cpp >>>>> ? 239???????? dcon->value_fn = cfn; >>>>> >>>>> Is it actually safe to remove the atomic update?? 
If multiple threads >>>>> performing the assignment *are* possible (and I don't understand the >>>>> context yet, so don't know the answer to that), then a bare >>>>> non-atomic >>>>> assignment is a race, e.g. undefined behavior. >>>>> >>>>> Regardless of that, I think the CAST_FROM_FN_PTR should be retained. >>>> >>>> I can find no uses of this code, ie. looking for "delayed_value". I >>>> think it was early jsr292 code.? I could also not find any >>>> combination of casts that would make it compile, so in the end I >>>> believed the comment and took out the cmpxchg.?? The code appears >>>> to be intended to for bootstrapping, see the call to >>>> update_delayed_values() in JavaClasses::compute_offsets(). >>>> >>>> The CAST_FROM_FN_PTR was to get it to compile with cmpxchg, the new >>>> code does not require a cast.? If you can help with finding the >>>> right set of casts, I'd be happy to put the cmpxchg back in. I just >>>> couldn't find one. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/classfile/classLoaderData.cpp >>>>> ? 167?? Chunk* head = (Chunk*) OrderAccess::load_acquire(&_head); >>>>> >>>>> I think the cast to Chunk* is no longer needed. >>>> >>>> Missed another, thanks.? No that's the same one David found. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/classfile/classLoaderData.cpp >>>>> ? 946???? ClassLoaderData* old = Atomic::cmpxchg(cld, cld_addr, >>>>> (ClassLoaderData*)NULL); >>>>> ? 947???? if (old != NULL) { >>>>> ? 948?????? delete cld; >>>>> ? 949?????? // Returns the data. >>>>> ? 950?????? return old; >>>>> ? 951???? } >>>>> >>>>> That could instead be >>>>> >>>>> ?? if (!Atomic::replace_if_null(cld, cld_addr)) { >>>>> ???? delete cld;?????????? // Lost the race. >>>>> ???? return *cld_addr;???? // Use the winner's value. >>>>> ?? } >>>>> >>>>> And apparently the caller of CLDG::add doesn't care whether the >>>>> returned CLD has actually been added to the graph yet.? If that's not >>>>> true, then there's a bug here, since a race loser might return a >>>>> winner's value before the winner has actually done the insertion. >>>> >>>> True, the race loser doesn't care whether the CLD has been added to >>>> the graph. >>>> Your instead code requires a comment that replace_if_null is really >>>> a compare exchange and has an extra read of the original value, so >>>> I am leaving what I have which is clearer to me. >>>> >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/classfile/verifier.cpp >>>>> ?? 71 static void* verify_byte_codes_fn() { >>>>> ?? 72?? if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == >>>>> NULL) { >>>>> ?? 73???? void *lib_handle = os::native_java_library(); >>>>> ?? 74???? void *func = os::dll_lookup(lib_handle, >>>>> "VerifyClassCodesForMajorVersion"); >>>>> ?? 75 OrderAccess::release_store(&_verify_byte_codes_fn, func); >>>>> ?? 76???? if (func == NULL) { >>>>> ?? 77?????? _is_new_verify_byte_codes_fn = false; >>>>> ?? 78?????? func = os::dll_lookup(lib_handle, "VerifyClassCodes"); >>>>> ?? 79 OrderAccess::release_store(&_verify_byte_codes_fn, func); >>>>> ?? 80???? } >>>>> ?? 81?? } >>>>> ?? 82?? return (void*)_verify_byte_codes_fn; >>>>> ?? 83 } >>>>> >>>>> [pre-existing] >>>>> >>>>> I think this code has race problems; a caller could unexpectedly and >>>>> inappropriately return NULL.? 
Consider the case where there is no >>>>> VerifyClassCodesForMajorVersion, but there is VerifyClassCodes. >>>>> >>>>> The variable is initially NULL. >>>>> >>>>> Both Thread1 and Thread2 reach line 73, having both seen a NULL value >>>>> for the variable. >>>>> >>>>> Thread1 reaches line 80, setting the variable to VerifyClassCodes. >>>>> >>>>> Thread2 reaches line 76, resetting the variable to NULL. >>>>> >>>>> Thread1 reads the now (momentarily) NULL value and returns it. >>>>> >>>>> I think the first release_store should be conditional on func != >>>>> NULL. >>>>> Also, the usage of _is_new_verify_byte_codes_fn seems suspect. >>>>> And a minor additional nit: the cast in the return is unnecessary. >>>> >>>> Yes, this looks like a bug.?? I'll cut/paste this and file it. It >>>> may be that this is support for the old verifier in old jdk >>>> versions that can be cleaned up. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/code/nmethod.cpp >>>>> 1664?? nmethod* observed_mark_link = _oops_do_mark_link; >>>>> 1665?? if (observed_mark_link == NULL) { >>>>> 1666???? // Claim this nmethod for this thread to mark. >>>>> 1667???? if (Atomic::cmpxchg_if_null(NMETHOD_SENTINEL, >>>>> &_oops_do_mark_link)) { >>>>> >>>>> With these changes, the only use of observed_mark_link is in the if. >>>>> I'm not sure that variable is really useful anymore, e.g. just use >>>>> >>>>> ?? if (_oops_do_mark_link == NULL) { >>>> >>>> Ok fixed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>>>> >>>>> In CMSCollector::par_take_from_overflow_list, if BUSY and prefix were >>>>> of type oopDesc*, I think there would be a whole lot fewer casts and >>>>> cast_to_oop's.? Later on, I think suffix_head, >>>>> observed_overflow_list, >>>>> and curr_overflow_list could also be oopDesc* instead of oop to >>>>> eliminate more casts. >>>> >>>> I actually tried to make this change but ran into more fan out that >>>> way, so went back and just fixed the cmpxchg calls to cast oops to >>>> oopDesc* and things were less perturbed that way. >>>>> >>>>> And some similar changes in CMSCollector::par_push_on_overflow_list. >>>>> >>>>> And similarly in parNewGeneration.cpp, in push_on_overflow_list and >>>>> take_from_overflow_list_work. >>>>> >>>>> As noted in the comments for JDK-8165857, the lists and "objects" >>>>> involved here aren't really oops, but rather the shattered remains of >>>> >>>> Yes, somewhat horrified at the value of BUSY. >>>>> oops.? The suggestion there was to use HeapWord* and carry through >>>>> the >>>>> fanout; what was actually done was to change _overflow_list to >>>>> oopDesc* to minimize fanout, even though that's kind of lying to the >>>>> type system.? Now, with the cleanup of cmpxchg_ptr and such, we're >>>>> paying the price of doing the minimal thing back then. >>>> >>>> I will file an RFE about cleaning this up.? I think what I've done >>>> was the minimal thing. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>>>> 7960?? Atomic::add(-n, &_num_par_pushes); >>>>> >>>>> Atomic::sub >>>> >>>> fixed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/cms/parNewGeneration.cpp >>>>> 1455?? Atomic::add(-n, &_num_par_pushes); >>>> fixed. 
>>>>> Atomic::sub >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/g1/dirtyCardQueue.cpp >>>>> ? 283???? void* actual = Atomic::cmpxchg(next, >>>>> &_cur_par_buffer_node, nd); >>>>> ... >>>>> ? 289?????? nd = static_cast(actual); >>>>> >>>>> Change actual's type to BufferNode* and remove the cast on line 289. >>>> >>>> fixed.? missed that one. gross. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/g1/g1CollectedHeap.cpp >>>>> >>>>> [pre-existing] >>>>> 3499???????? old = (CompiledMethod*)_postponed_list; >>>>> >>>>> I think that cast is only needed because >>>>> G1CodeCacheUnloadingTask::_postponed_list is incorrectly typed as >>>>> "volatile CompiledMethod*", when I think it ought to be >>>>> "CompiledMethod* volatile". >>>>> >>>>> I think G1CodeCacheUnloading::_claimed_nmethod is similarly >>>>> mis-typed, >>>>> with a similar should not be needed cast: >>>>> 3530?????? first = (CompiledMethod*)_claimed_nmethod; >>>>> >>>>> and another for _postponed_list here: >>>>> 3552?????? claim = (CompiledMethod*)_postponed_list; >>>> >>>> I've fixed this.?? C++ is so confusing about where to put the >>>> volatile.?? Everyone has been tripped up by it. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/g1/g1HotCardCache.cpp >>>>> ?? 77?? jbyte* previous_ptr = (jbyte*)Atomic::cmpxchg(card_ptr, >>>>> >>>>> I think the cast of the cmpxchg result is no longer needed. >>>> >>>> fixed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp >>>>> ? 254?????? char* touch_addr = >>>>> (char*)Atomic::add(actual_chunk_size, &_cur_addr) - >>>>> actual_chunk_size; >>>>> >>>>> I think the cast of the add result is no longer needed. >>>> got it already. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/g1/g1StringDedup.cpp >>>>> ? 213?? return (size_t)Atomic::add(partition_size, &_next_bucket) >>>>> - partition_size; >>>>> >>>>> I think the cast of the add result is no longer needed. >>>> >>>> I was slacking in the g1 files.? fixed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >>>>> ? 200?????? PerRegionTable* res = >>>>> ? 201???????? Atomic::cmpxchg(nxt, &_free_list, fl); >>>>> >>>>> Please remove the line break, now that the code has been simplified. >>>>> >>>>> But wait, doesn't this alloc exhibit classic ABA problems?? I *think* >>>>> this works because alloc and bulk_free are called in different >>>>> phases, >>>>> never overlapping. >>>> >>>> I don't know.? Do you want to file a bug to investigate this? >>>> fixed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/g1/sparsePRT.cpp >>>>> ? 295???? SparsePRT* res = >>>>> ? 296?????? Atomic::cmpxchg(sprt, &_head_expanded_list, hd); >>>>> and >>>>> ? 307???? SparsePRT* res = >>>>> ? 308?????? Atomic::cmpxchg(next, &_head_expanded_list, hd); >>>>> >>>>> I'd rather not have the line breaks in these either. >>>>> >>>>> And get_from_expanded_list also appears to have classic ABA problems. 
>>>>> I *think* this works because add_to_expanded_list and >>>>> get_from_expanded_list are called in different phases, never >>>>> overlapping. >>>> >>>> Fixed, same question as above?? Or one bug to investigate both? >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>>>> ? 262?? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>>>> ? 263?????????????????????????????????? (volatile intptr_t *)&_data, >>>>> ? 264 (intptr_t)old_age._data); >>>>> >>>>> This should be >>>>> >>>>> ?? return Atomic::cmpxchg(new_age._data, &_data, old_age._data); >>>> >>>> fixed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/interpreter/bytecodeInterpreter.cpp >>>>> This doesn't have any casts, which I think is correct. >>>>> ? 708???????????? if (Atomic::cmpxchg(header, rcvr->mark_addr(), >>>>> mark) == mark) { >>>>> >>>>> but these do. >>>>> ? 718???????????? if (Atomic::cmpxchg((void*)new_header, >>>>> rcvr->mark_addr(), mark) == mark) { >>>>> ? 737???????????? if (Atomic::cmpxchg((void*)new_header, >>>>> rcvr->mark_addr(), header) == header) { >>>>> >>>>> I'm not sure how the ones with casts even compile? mark_addr() seems >>>>> to be a markOop*, which is a markOopDesc**, where markOopDesc is a >>>>> class.? void* is not implicitly convertible to markOopDesc*. >>>>> >>>>> Hm, this entire file is #ifdef CC_INTERP.? Is this zero-only >>>>> code?? Or >>>>> something like that? >>>>> >>>>> Similarly here: >>>>> ? 906?????????? if (Atomic::cmpxchg(header, lockee->mark_addr(), >>>>> mark) == mark) { >>>>> and >>>>> ? 917?????????? if (Atomic::cmpxchg((void*)new_header, >>>>> lockee->mark_addr(), mark) == mark) { >>>>> ? 935?????????? if (Atomic::cmpxchg((void*)new_header, >>>>> lockee->mark_addr(), header) == header) { >>>>> >>>>> and here: >>>>> 1847?????????????? if (Atomic::cmpxchg(header, >>>>> lockee->mark_addr(), mark) == mark) { >>>>> 1858?????????????? if (Atomic::cmpxchg((void*)new_header, >>>>> lockee->mark_addr(), mark) == mark) { >>>>> 1878?????????????? if (Atomic::cmpxchg((void*)new_header, >>>>> lockee->mark_addr(), header) == header) { >>>>> >>>>> and here: >>>>> 1847?????????????? if (Atomic::cmpxchg(header, >>>>> lockee->mark_addr(), mark) == mark) { >>>>> 1858?????????????? if (Atomic::cmpxchg((void*)new_header, >>>>> lockee->mark_addr(), mark) == mark) { >>>>> 1878?????????????? if (Atomic::cmpxchg((void*)new_header, >>>>> lockee->mark_addr(), header) == header) { >>>> >>>> I've changed all these.?? This is part of Zero. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/memory/metaspace.cpp >>>>> 1502?? size_t value = OrderAccess::load_acquire(&_capacity_until_GC); >>>>> ... >>>>> 1537?? return (size_t)Atomic::sub((intptr_t)v, &_capacity_until_GC); >>>>> >>>>> These and other uses of _capacity_until_GC suggest that variable's >>>>> type should be size_t rather than intptr_t.? Note that I haven't done >>>>> a careful check of uses to see if there are any places where such a >>>>> change would cause problems. >>>> >>>> Yes, I had a hard time with metaspace.cpp because I agree >>>> _capacity_until_GC should be size_t.?? Tried to make this change >>>> and it cascaded a bit.? I'll file an RFE to change this type >>>> separately. 
>>>> >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/oops/constantPool.cpp >>>>> ? 229?? OrderAccess::release_store((Klass* volatile *)adr, k); >>>>> ? 246?? OrderAccess::release_store((Klass* volatile *)adr, k); >>>>> ? 514?? OrderAccess::release_store((Klass* volatile *)adr, k); >>>>> >>>>> Casts are not needed. >>>> >>>> fixed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/oops/constantPool.hpp >>>>> ? 148???? volatile intptr_t adr = >>>>> OrderAccess::load_acquire(obj_at_addr_raw(which)); >>>>> >>>>> [pre-existing] >>>>> Why is adr declared volatile? >>>> >>>> golly beats me.? concurrency is scary, especially in the constant >>>> pool. >>>> The load_acquire() should make sure the value is fetched from >>>> memory so volatile is unneeded. >>>> >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/oops/cpCache.cpp >>>>> ? 157???? intx newflags = (value & parameter_size_mask); >>>>> ? 158???? Atomic::cmpxchg(newflags, &_flags, (intx)0); >>>>> >>>>> This is a nice demonstration of why I wanted to include some value >>>>> preserving integral conversions in cmpxchg, rather than requiring >>>>> exact type matching in the integral case.? There have been some >>>>> others >>>>> that I haven't commented on.? Apparently we (I) got away with >>>>> including such conversions in Atomic::add, which I'd forgotten about. >>>>> And see comment regarding Atomic::sub below. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/oops/cpCache.hpp >>>>> ? 139?? volatile Metadata*?? _f1;?????? // entry specific metadata >>>>> field >>>>> >>>>> [pre-existing] >>>>> I suspect the type should be Metadata* volatile.? And that would >>>>> eliminate the need for the cast here: >>>>> >>>>> ? 339?? Metadata* f1_ord() const?????????????????????? { return >>>>> (Metadata *)OrderAccess::load_acquire(&_f1); } >>>>> >>>>> I don't know if there are any other changes needed or desirable >>>>> around >>>>> _f1 usage. >>>> >>>> yes, fixed this. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/oops/method.hpp >>>>> ? 139?? volatile address from_compiled_entry() const?? { return >>>>> OrderAccess::load_acquire(&_from_compiled_entry); } >>>>> ? 140?? volatile address from_compiled_entry_no_trampoline() const; >>>>> ? 141?? volatile address from_interpreted_entry() const{ return >>>>> OrderAccess::load_acquire(&_from_interpreted_entry); } >>>>> >>>>> [pre-existing] >>>>> The volatile qualifiers here seem suspect to me. >>>> >>>> Again much suspicion about concurrency and giant pain, which I >>>> remember, of debugging these when they were broken. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/oops/oop.inline.hpp >>>>> ? 391???? narrowOop old = (narrowOop)Atomic::xchg(val, >>>>> (narrowOop*)dest); >>>>> >>>>> Cast of return type is not needed. >>>> >>>> fixed. >>>> >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/prims/jni.cpp >>>>> >>>>> [pre-existing] >>>>> >>>>> copy_jni_function_table should be using Copy::disjoint_words_atomic. >>>> >>>> yuck. 
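The volatile-placement point made above for G1CodeCacheUnloadingTask::_postponed_list and for the _f1 field in cpCache.hpp comes down to which side of the * the qualifier lands on. A minimal sketch with a stub type (not the real Metadata or CompiledMethod declarations):

// Stub type for illustration only.
class Metadata;

// "pointer to volatile Metadata": the pointee is volatile, so reading the
// field into a plain Metadata* needs the cast the review flags.
static volatile Metadata* _pointee_volatile;

// "volatile pointer to Metadata": the pointer itself is the shared datum,
// which is what a concurrently updated field wants.
static Metadata* volatile _pointer_volatile;

void read_fields() {
  Metadata* a = (Metadata*)_pointee_volatile;   // cast required
  Metadata* b = _pointer_volatile;              // no cast needed
  (void)a; (void)b;
}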
>>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/prims/jni.cpp >>>>> >>>>> [pre-existing] >>>>> >>>>> 3892?? // We're about to use Atomic::xchg for synchronization. >>>>> Some Zero >>>>> 3893?? // platforms use the GCC builtin __sync_lock_test_and_set >>>>> for this, >>>>> 3894?? // but __sync_lock_test_and_set is not guaranteed to do >>>>> what we want >>>>> 3895?? // on all architectures.? So we check it works before >>>>> relying on it. >>>>> 3896 #if defined(ZERO) && defined(ASSERT) >>>>> 3897?? { >>>>> 3898???? jint a = 0xcafebabe; >>>>> 3899???? jint b = Atomic::xchg(0xdeadbeef, &a); >>>>> 3900???? void *c = &a; >>>>> 3901???? void *d = Atomic::xchg(&b, &c); >>>>> 3902???? assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, >>>>> "Atomic::xchg() works"); >>>>> 3903???? assert(c == &b && d == &a, "Atomic::xchg() works"); >>>>> 3904?? } >>>>> 3905 #endif // ZERO && ASSERT >>>>> >>>>> It seems rather strange to be testing Atomic::xchg() here, rather >>>>> than >>>>> as part of unit testing Atomic?? Fail unit testing => don't try to >>>>> use... >>>> >>>> This is zero.? I'm not touching this. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/prims/jvmtiRawMonitor.cpp >>>>> ? 130???? if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >>>>> ? 142???? if (_owner == NULL && >>>>> Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >>>>> >>>>> I think these casts aren't needed. _owner is void*, and Self is >>>>> Thread*, which is implicitly convertible to void*. >>>>> >>>>> Similarly here, for the THREAD argument: >>>>> ? 280???? Contended = Atomic::cmpxchg((void*)THREAD, &_owner, >>>>> (void*)NULL); >>>>> ? 283???? Contended = Atomic::cmpxchg((void*)THREAD, &_owner, >>>>> (void*)NULL); >>>> >>>> Okay, let me see if the compiler(s) eat that. (yes they do) >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/prims/jvmtiRawMonitor.hpp >>>>> >>>>> This file is in the webrev, but seems to be unchanged. >>>> >>>> It'll be cleaned up with the the commit and not be part of the >>>> changeset. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/atomic.hpp >>>>> ? 520 template >>>>> ? 521 inline D Atomic::sub(I sub_value, D volatile* dest) { >>>>> ? 522?? STATIC_ASSERT(IsPointer::value || IsIntegral::value); >>>>> ? 523?? // Assumes two's complement integer representation. >>>>> ? 524?? #pragma warning(suppress: 4146) >>>>> ? 525?? return Atomic::add(-sub_value, dest); >>>>> ? 526 } >>>>> >>>>> I'm pretty sure this implementation is incorrect.? I think it >>>>> produces >>>>> the wrong result when I and D are both unsigned integer types and >>>>> sizeof(I) < sizeof(D). >>>> >>>> Can you suggest a correction?? I just copied Atomic::dec(). >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/mutex.cpp >>>>> ? 304?? intptr_t v = Atomic::cmpxchg((intptr_t)_LBIT, >>>>> &_LockWord.FullWord, (intptr_t)0);? // agro ... >>>>> >>>>> _LBIT should probably be intptr_t, rather than an enum. Note that the >>>>> enum type is unused.? The old value here is another place where an >>>>> implicit widening of same signedness would have been nice. 
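The Atomic::sub concern above (wrong result when I and D are both unsigned and sizeof(I) < sizeof(D)) can be seen with plain integer arithmetic: the negation happens in the narrow type and is then zero-extended rather than treated as a subtraction. A standalone illustration, not the Atomic code itself:

#include <stdint.h>
#include <stdio.h>

int main() {
  uint64_t dest = 100;       // D: wider unsigned destination
  uint32_t sub_value = 1;    // I: narrower unsigned operand
  // What add(-sub_value, &dest) boils down to for these types:
  dest += -sub_value;        // -sub_value == 0xFFFFFFFF (uint32_t),
                             // widened to 0x00000000FFFFFFFF (uint64_t)
  printf("%llu\n", (unsigned long long)dest);  // prints 4294967395, not 99
  return 0;
}

// With a same-width dest the modular wraparound would give 99, which is
// why the same pattern looks fine in the same-type and dec() cases.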
(Such >>>>> implicit widening doesn't work for enums, since it's unspecified >>>>> whether they default to signed or unsigned representation, and >>>>> implementatinos differ.) >>>> >>>> This would be a good/simple cleanup.? I changed it to const >>>> intptr_t _LBIT = 1; >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/mutex.hpp >>>>> >>>>> [pre-existing] >>>>> >>>>> I think the Address member of the SplitWord union is unused. Looking >>>>> at AcquireOrPush (and others), I'm wondering whether it *should* be >>>>> used there, or whether just using intptr_t casts and doing integral >>>>> arithmetic (as is presently being done) is easier and clearer. >>>>> >>>>> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp >>>>> rather than polluting the global namespace.? And technically, that >>>>> name is reserved word. >>>> >>>> I moved both this and _LBIT into the top of mutex.cpp since they >>>> are used there. >>>> Cant define const intptr_t _LBIT =1; in a class in our version of C++. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/objectMonitor.cpp >>>>> ? 252?? void * cur = Atomic::cmpxchg((void*)Self, &_owner, >>>>> (void*)NULL); >>>>> ? 409?? if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) { >>>>> 1983?????? ox = (Thread*)Atomic::cmpxchg((void*)Self, &_owner, >>>>> (void*)NULL); >>>>> >>>>> I think the casts of Self aren't needed. >>>> >>>> fixed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/objectMonitor.cpp >>>>> ? 995?????? if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { >>>>> 1020???????? if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) { >>>>> >>>>> I think the casts of THREAD aren't needed. >>>> >>>> nope, fixed. >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/objectMonitor.hpp >>>>> ? 254?? markOopDesc* volatile* header_addr(); >>>>> >>>>> Why isn't this volatile markOop* ? >>>> >>>> fixed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/synchronizer.cpp >>>>> ? 242???????? Atomic::cmpxchg_if_null((void*)Self, &(m->_owner))) { >>>>> >>>>> I think the cast of Self isn't needed. >>>> >>>> fixed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/synchronizer.cpp >>>>> ? 992?? for (; block != NULL; block = (PaddedEnd >>>>> *)next(block)) { >>>>> 1734???? for (; block != NULL; block = (PaddedEnd >>>>> *)next(block)) { >>>>> >>>>> [pre-existing] >>>>> All calls to next() pass a PaddedEnd* and cast the >>>>> result.? How about moving all that behavior into next(). >>>> >>>> I fixed this next() function, but it necessitated a cast to >>>> FreeNext field.? The PaddedEnd<> type was intentionally not >>>> propagated to all the things that use it.?? Which is a shame >>>> because there are a lot more casts to PaddedEnd that >>>> could have been removed. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/synchronizer.cpp >>>>> 1970???? if (monitor > (ObjectMonitor *)&block[0] && >>>>> 1971???????? monitor < (ObjectMonitor *)&block[_BLOCKSIZE]) { >>>>> >>>>> [pre-existing] >>>>> Are the casts needed here?? 
I think PaddedEnd is >>>>> derived from ObjectMonitor, so implicit conversions should apply. >>>> >>>> prob not.? removed them. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/synchronizer.hpp >>>>> ?? 28 #include "memory/padded.hpp" >>>>> ? 163?? static PaddedEnd * volatile gBlockList; >>>>> >>>>> I was going to suggest as an alternative just making gBlockList a >>>>> file >>>>> scoped variable in synchronizer.cpp, since it isn't used outside of >>>>> that file. Except that it is referenced by vmStructs. Curses! >>>> >>>> It's also used by the SA. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/runtime/thread.cpp >>>>> 4707?? intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, >>>>> (intptr_t)0); >>>>> >>>>> This and other places suggest LOCKBIT should be defined as intptr_t, >>>>> rather than as an enum value.? The MuxBits enum type is unused. >>>>> >>>>> And the cast of 0 is another case where implicit widening would be >>>>> nice. >>>> >>>> Making LOCKBIT a const intptr_t = 1 removes a lot of casts. >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/hotspot/share/services/mallocSiteTable.cpp >>>>> ? 261 bool MallocSiteHashtableEntry::atomic_insert(const >>>>> MallocSiteHashtableEntry* entry) { >>>>> ? 262?? return Atomic::cmpxchg_if_null(entry, (const >>>>> MallocSiteHashtableEntry**)&_next); >>>>> ? 263 } >>>>> >>>>> I think the problem here that is leading to the cast is that >>>>> atomic_insert is taking a const T*.? Note that it's only caller >>>>> passes >>>>> a non-const T*. >>>> >>>> I'll change the type to non-const.? We try to use consts... >>>> >>>> Thanks for the detailed review!? The gcc compiler seems happy so >>>> far, I'll post a webrev of the result of these changes after fixing >>>> Atomic::sub() and seeing how the other compilers deal with these >>>> changes. >>>> >>>> Thanks, >>>> Coleen >>>> >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> >>>> >>> From rkennke at redhat.com Mon Oct 16 13:45:00 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 16 Oct 2017 15:45:00 +0200 Subject: RFR: 8189333: Fix Zero build after Atomic::xchg changes In-Reply-To: <441ed55f-6398-9fa1-d571-86548ed5a2a9@oracle.com> References: <003ff7d9-759f-1ef5-f580-18c2571b63e5@redhat.com> <441ed55f-6398-9fa1-d571-86548ed5a2a9@oracle.com> Message-ID: <9cd66129-3636-8de3-4624-a69bd8f28b99@redhat.com> Hi Coleen, Nope. It fails with this (and a bunch of similar) errors: https://paste.fedoraproject.org/paste/cWKozoxY23z72~EMm0BPBA It does build with this additional patch: http://cr.openjdk.java.net/~rkennke/fix-zero-coleen/webrev/ I.e.: - cast BasicLock to markOop by using markOopDesc::encode() - use oopDesc::cas_set_mark() instead of the raw Atomic ops (probably not strictly required for this change, but still much nicer) You should not require any build scripts for Zero though. Simply run configure with --with-jvm-variants=zero and build in the corresponding linux-x86_64-normal-zero-slowdebug or similar directory using the usual make calls. > > Hi Roman, Can you build zero with this changeset? > > http://cr.openjdk.java.net/~coleenp/8188220.03/webrev/index.html > > My scripts for building zero are broken now. 
>
> thanks,
> Coleen
>
> On 10/15/17 5:40 PM, Roman Kennke wrote:
>> On 15.10.2017 at 23:32, David Holmes wrote:
>>> Hi Roman,
>>>
>>> On 16/10/2017 7:12 AM, Roman Kennke wrote:
>>>> Zero debug build has been broken by: JDK-8187977: Generalize
>>>> Atomic::xchg to use templates.
>>>>
>>>> This patch fixes it by casting the unsigned literal to jint:
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/8189333/webrev.00/
>>>>
>>>
>>> Looks fine.
>>>
>>> I can push this for you straight away (relatively speaking :) )
>>> under the trivial rule.
>> Thanks!
>>
>> Roman
>

From stefan.karlsson at oracle.com  Mon Oct 16 14:14:27 2017
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 16 Oct 2017 16:14:27 +0200
Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor
Message-ID: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com>

Hi all,

Please review this patch to move the JNI global weak handle processing
out of the ReferenceProcessor into a new class, WeakProcessor, that
will be used to gather processing and cleaning of "native weak" oops.
After this patch the ReferenceProcessor will only deal with the Java
level java.lang.ref weak references.

http://cr.openjdk.java.net/~stefank/8189359/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8189359

Note that this patch only moves the JNIHandles::weak_oops_do calls into
the new WeakProcessor. A subsequent patch for JDK-8189359 will move the
JvmtiExport::weak_oops_do from JNIHandleBlock into the WeakProcessor.

Future patches, like JDK-8171119 for example, will be able to add their
sets of native weak oops into the new WeakProcessor functions and won't
have to duplicate the code for all GCs or add calls inside the
ReferenceProcessor.

Tested with JPRT.

Thanks,
StefanK

From nils.eliasson at oracle.com  Mon Oct 16 14:26:40 2017
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Mon, 16 Oct 2017 16:26:40 +0200
Subject: RFR: Newer AMD 17h (EPYC) Processor family defaults
In-Reply-To: 
References: <4d4fe028-ea6a-4f77-ab69-5c2bc752e1f5@oracle.com>
	<47bc0a90-ed6a-220a-c3d1-b4df2d8bbc74@oracle.com>
	<9c53f889-e58e-33ac-3c05-874779b469d6@oracle.com>
	<45619e1a-9eb0-a540-193b-5187da3bf6bc@oracle.com>
	<66e4af43-c0e2-6d64-b69f-35166150ffa2@oracle.com>
	<11af0f62-ba6b-d533-d23c-750d2ca012c7@oracle.com>
Message-ID: <886a112d-fc55-34d5-6e70-1e6a78cf1b0f@oracle.com>

Hi,

I ran into a problem touching this area, so I'm hijacking this thread.

> #ifdef COMPILER2
> -   if (MaxVectorSize > 16) {
> -     // Limit vectors size to 16 bytes on current AMD cpus.
> +   if (cpu_family() < 0x17 && MaxVectorSize > 16) {
> +     // Limit vectors size to 16 bytes on AMD cpus < 17h.
>       FLAG_SET_DEFAULT(MaxVectorSize, 16);
>     }
> #endif // COMPILER2

The limitation of MaxVectorSize to 16 for some processors in this code
has the side effect that TypeVect::VECTY and mreg2type[Op_VecY] won't
be initialized even though the platform has the capability.

Type.cpp:~660
[...]
> if (Matcher::vector_size_supported(T_FLOAT,4)) {
>   TypeVect::VECTX = TypeVect::make(T_FLOAT,4);
> }
> if (Matcher::vector_size_supported(T_FLOAT,8)) {
>   TypeVect::VECTY = TypeVect::make(T_FLOAT,8);
> }
> if (Matcher::vector_size_supported(T_FLOAT,16)) {
>   TypeVect::VECTZ = TypeVect::make(T_FLOAT,16);
> }
[...]
> mreg2type[Op_VecX] = TypeVect::VECTX;
> mreg2type[Op_VecY] = TypeVect::VECTY;
> mreg2type[Op_VecZ] = TypeVect::VECTZ;

In the ad files, feature flags (UseAVX etc.) are used to control which
rules should be matched when they affect specific vector registers.
Here we have a mismatch.
On a platform that supports AVX2 but have MaxVectorSize limited to 16, the VM will fail in regalloc when the TypeVect::VECTY/mreg2type[Op_VecY] is uninitalized, we will also hit asserts in a few places like: assert(Matcher::vector_size_supported(T_FLOAT,RegMask::SlotsPerVecY), "sanity"); Shouldn't the type initalization in type.cpp be dependent on feature flag (UseAVX etc.) instead of MaxVectorLength? (The type for the vector registers are initalized if the platform supports them, but they might not be used if MaxVectorSize is limited.) I suggest something like this: http://cr.openjdk.java.net/~neliasso/maxvectorsize/webrev/ I will open a bug and and a separate RFR if this seems reasonable to you. Regards, Nils Eliasson On 2017-09-22 09:41, Rohit Arul Raj wrote: > Thanks Vladimir, > > On Wed, Sep 20, 2017 at 10:07 PM, Vladimir Kozlov > wrote: >>> __ cmpl(rax, 0x80000000); // Is cpuid(0x80000001) supported? >>> __ jcc(Assembler::belowEqual, done); >>> __ cmpl(rax, 0x80000004); // Is cpuid(0x80000005) supported? >>> - __ jccb(Assembler::belowEqual, ext_cpuid1); >>> + __ jcc(Assembler::belowEqual, ext_cpuid1); >> >> Good. You may need to increase size of the buffer too (to be safe) to 1100: >> >> static const int stub_size = 1000; >> > Please find the updated patch after the requested change. > > diff --git a/src/cpu/x86/vm/vm_version_x86.cpp > b/src/cpu/x86/vm/vm_version_x86.cpp > --- a/src/cpu/x86/vm/vm_version_x86.cpp > +++ b/src/cpu/x86/vm/vm_version_x86.cpp > @@ -46,7 +46,7 @@ > address VM_Version::_cpuinfo_cont_addr = 0; > > static BufferBlob* stub_blob; > -static const int stub_size = 1000; > +static const int stub_size = 1100; > > extern "C" { > typedef void (*get_cpu_info_stub_t)(void*); > @@ -70,7 +70,7 @@ > bool use_evex = FLAG_IS_DEFAULT(UseAVX) || (UseAVX > 2); > > Label detect_486, cpu486, detect_586, std_cpuid1, std_cpuid4; > - Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, > done, wrapup; > + Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, > ext_cpuid8, done, wrapup; > Label legacy_setup, save_restore_except, legacy_save_restore, > start_simd_check; > > StubCodeMark mark(this, "VM_Version", "get_cpu_info_stub"); > @@ -267,14 +267,30 @@ > __ cmpl(rax, 0x80000000); // Is cpuid(0x80000001) supported? > __ jcc(Assembler::belowEqual, done); > __ cmpl(rax, 0x80000004); // Is cpuid(0x80000005) supported? > - __ jccb(Assembler::belowEqual, ext_cpuid1); > + __ jcc(Assembler::belowEqual, ext_cpuid1); > __ cmpl(rax, 0x80000006); // Is cpuid(0x80000007) supported? > __ jccb(Assembler::belowEqual, ext_cpuid5); > __ cmpl(rax, 0x80000007); // Is cpuid(0x80000008) supported? > __ jccb(Assembler::belowEqual, ext_cpuid7); > + __ cmpl(rax, 0x80000008); // Is cpuid(0x80000009 and above) supported? > + __ jccb(Assembler::belowEqual, ext_cpuid8); > + __ cmpl(rax, 0x8000001E); // Is cpuid(0x8000001E) supported? 
> + __ jccb(Assembler::below, ext_cpuid8); > + // > + // Extended cpuid(0x8000001E) > + // > + __ movl(rax, 0x8000001E); > + __ cpuid(); > + __ lea(rsi, Address(rbp, in_bytes(VM_Version::ext_cpuid1E_offset()))); > + __ movl(Address(rsi, 0), rax); > + __ movl(Address(rsi, 4), rbx); > + __ movl(Address(rsi, 8), rcx); > + __ movl(Address(rsi,12), rdx); > + > // > // Extended cpuid(0x80000008) > // > + __ bind(ext_cpuid8); > __ movl(rax, 0x80000008); > __ cpuid(); > __ lea(rsi, Address(rbp, in_bytes(VM_Version::ext_cpuid8_offset()))); > @@ -1109,11 +1125,27 @@ > } > > #ifdef COMPILER2 > - if (MaxVectorSize > 16) { > - // Limit vectors size to 16 bytes on current AMD cpus. > + if (cpu_family() < 0x17 && MaxVectorSize > 16) { > + // Limit vectors size to 16 bytes on AMD cpus < 17h. > FLAG_SET_DEFAULT(MaxVectorSize, 16); > } > #endif // COMPILER2 > + > + // Some defaults for AMD family 17h > + if ( cpu_family() == 0x17 ) { > + // On family 17h processors use XMM and UnalignedLoadStores for > Array Copy > + if (supports_sse2() && FLAG_IS_DEFAULT(UseXMMForArrayCopy)) { > + FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); > + } > + if (supports_sse2() && FLAG_IS_DEFAULT(UseUnalignedLoadStores)) { > + FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); > + } > +#ifdef COMPILER2 > + if (supports_sse4_2() && FLAG_IS_DEFAULT(UseFPUForSpilling)) { > + FLAG_SET_DEFAULT(UseFPUForSpilling, true); > + } > +#endif > + } > } > > if( is_intel() ) { // Intel cpus specific settings > diff --git a/src/cpu/x86/vm/vm_version_x86.hpp > b/src/cpu/x86/vm/vm_version_x86.hpp > --- a/src/cpu/x86/vm/vm_version_x86.hpp > +++ b/src/cpu/x86/vm/vm_version_x86.hpp > @@ -228,6 +228,15 @@ > } bits; > }; > > + union ExtCpuid1EEbx { > + uint32_t value; > + struct { > + uint32_t : 8, > + threads_per_core : 8, > + : 16; > + } bits; > + }; > + > union XemXcr0Eax { > uint32_t value; > struct { > @@ -398,6 +407,12 @@ > ExtCpuid8Ecx ext_cpuid8_ecx; > uint32_t ext_cpuid8_edx; // reserved > > + // cpuid function 0x8000001E // AMD 17h > + uint32_t ext_cpuid1E_eax; > + ExtCpuid1EEbx ext_cpuid1E_ebx; // threads per core (AMD17h) > + uint32_t ext_cpuid1E_ecx; > + uint32_t ext_cpuid1E_edx; // unused currently > + > // extended control register XCR0 (the XFEATURE_ENABLED_MASK register) > XemXcr0Eax xem_xcr0_eax; > uint32_t xem_xcr0_edx; // reserved > @@ -505,6 +520,14 @@ > result |= CPU_CLMUL; > if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) > result |= CPU_RTM; > + if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) > + result |= CPU_ADX; > + if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) > + result |= CPU_BMI2; > + if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) > + result |= CPU_SHA; > + if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) > + result |= CPU_FMA; > > // AMD features. > if (is_amd()) { > @@ -518,16 +541,8 @@ > } > // Intel features. 
> if(is_intel()) { > - if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) > - result |= CPU_ADX; > - if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) > - result |= CPU_BMI2; > - if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) > - result |= CPU_SHA; > if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel != 0) > result |= CPU_LZCNT; > - if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) > - result |= CPU_FMA; > // for Intel, ecx.bits.misalignsse bit (bit 8) indicates > support for prefetchw > if (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse != 0) { > result |= CPU_3DNOW_PREFETCH; > @@ -590,6 +605,7 @@ > static ByteSize ext_cpuid5_offset() { return > byte_offset_of(CpuidInfo, ext_cpuid5_eax); } > static ByteSize ext_cpuid7_offset() { return > byte_offset_of(CpuidInfo, ext_cpuid7_eax); } > static ByteSize ext_cpuid8_offset() { return > byte_offset_of(CpuidInfo, ext_cpuid8_eax); } > + static ByteSize ext_cpuid1E_offset() { return > byte_offset_of(CpuidInfo, ext_cpuid1E_eax); } > static ByteSize tpl_cpuidB0_offset() { return > byte_offset_of(CpuidInfo, tpl_cpuidB0_eax); } > static ByteSize tpl_cpuidB1_offset() { return > byte_offset_of(CpuidInfo, tpl_cpuidB1_eax); } > static ByteSize tpl_cpuidB2_offset() { return > byte_offset_of(CpuidInfo, tpl_cpuidB2_eax); } > @@ -673,8 +689,12 @@ > if (is_intel() && supports_processor_topology()) { > result = _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus; > } else if (_cpuid_info.std_cpuid1_edx.bits.ht != 0) { > - result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / > - cores_per_cpu(); > + if (cpu_family() >= 0x17) { > + result = _cpuid_info.ext_cpuid1E_ebx.bits.threads_per_core + 1; > + } else { > + result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / > + cores_per_cpu(); > + } > } > return (result == 0 ? 1 : result); > } > > Regards, > Rohit > >>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>> b/src/cpu/x86/vm/vm_version_x86.cpp >>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>> @@ -70,7 +70,7 @@ >>> bool use_evex = FLAG_IS_DEFAULT(UseAVX) || (UseAVX > 2); >>> >>> Label detect_486, cpu486, detect_586, std_cpuid1, std_cpuid4; >>> - Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >>> done, wrapup; >>> + Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >>> ext_cpuid8, done, wrapup; >>> Label legacy_setup, save_restore_except, legacy_save_restore, >>> start_simd_check; >>> >>> StubCodeMark mark(this, "VM_Version", "get_cpu_info_stub"); >>> @@ -267,14 +267,30 @@ >>> __ cmpl(rax, 0x80000000); // Is cpuid(0x80000001) supported? >>> __ jcc(Assembler::belowEqual, done); >>> __ cmpl(rax, 0x80000004); // Is cpuid(0x80000005) supported? >>> - __ jccb(Assembler::belowEqual, ext_cpuid1); >>> + __ jcc(Assembler::belowEqual, ext_cpuid1); >>> __ cmpl(rax, 0x80000006); // Is cpuid(0x80000007) supported? >>> __ jccb(Assembler::belowEqual, ext_cpuid5); >>> __ cmpl(rax, 0x80000007); // Is cpuid(0x80000008) supported? >>> __ jccb(Assembler::belowEqual, ext_cpuid7); >>> + __ cmpl(rax, 0x80000008); // Is cpuid(0x80000009 and above) >>> supported? >>> + __ jccb(Assembler::belowEqual, ext_cpuid8); >>> + __ cmpl(rax, 0x8000001E); // Is cpuid(0x8000001E) supported? 
>>> + __ jccb(Assembler::below, ext_cpuid8); >>> + // >>> + // Extended cpuid(0x8000001E) >>> + // >>> + __ movl(rax, 0x8000001E); >>> + __ cpuid(); >>> + __ lea(rsi, Address(rbp, >>> in_bytes(VM_Version::ext_cpuid1E_offset()))); >>> + __ movl(Address(rsi, 0), rax); >>> + __ movl(Address(rsi, 4), rbx); >>> + __ movl(Address(rsi, 8), rcx); >>> + __ movl(Address(rsi,12), rdx); >>> + >>> // >>> // Extended cpuid(0x80000008) >>> // >>> + __ bind(ext_cpuid8); >>> __ movl(rax, 0x80000008); >>> __ cpuid(); >>> __ lea(rsi, Address(rbp, >>> in_bytes(VM_Version::ext_cpuid8_offset()))); >>> @@ -1109,11 +1125,27 @@ >>> } >>> >>> #ifdef COMPILER2 >>> - if (MaxVectorSize > 16) { >>> - // Limit vectors size to 16 bytes on current AMD cpus. >>> + if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>> + // Limit vectors size to 16 bytes on AMD cpus < 17h. >>> FLAG_SET_DEFAULT(MaxVectorSize, 16); >>> } >>> #endif // COMPILER2 >>> + >>> + // Some defaults for AMD family 17h >>> + if ( cpu_family() == 0x17 ) { >>> + // On family 17h processors use XMM and UnalignedLoadStores for >>> Array Copy >>> + if (supports_sse2() && FLAG_IS_DEFAULT(UseXMMForArrayCopy)) { >>> + FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >>> + } >>> + if (supports_sse2() && FLAG_IS_DEFAULT(UseUnalignedLoadStores)) { >>> + FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >>> + } >>> +#ifdef COMPILER2 >>> + if (supports_sse4_2() && FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>> + FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>> + } >>> +#endif >>> + } >>> } >>> >>> if( is_intel() ) { // Intel cpus specific settings >>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>> b/src/cpu/x86/vm/vm_version_x86.hpp >>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>> @@ -228,6 +228,15 @@ >>> } bits; >>> }; >>> >>> + union ExtCpuid1EEbx { >>> + uint32_t value; >>> + struct { >>> + uint32_t : 8, >>> + threads_per_core : 8, >>> + : 16; >>> + } bits; >>> + }; >>> + >>> union XemXcr0Eax { >>> uint32_t value; >>> struct { >>> @@ -398,6 +407,12 @@ >>> ExtCpuid8Ecx ext_cpuid8_ecx; >>> uint32_t ext_cpuid8_edx; // reserved >>> >>> + // cpuid function 0x8000001E // AMD 17h >>> + uint32_t ext_cpuid1E_eax; >>> + ExtCpuid1EEbx ext_cpuid1E_ebx; // threads per core (AMD17h) >>> + uint32_t ext_cpuid1E_ecx; >>> + uint32_t ext_cpuid1E_edx; // unused currently >>> + >>> // extended control register XCR0 (the XFEATURE_ENABLED_MASK >>> register) >>> XemXcr0Eax xem_xcr0_eax; >>> uint32_t xem_xcr0_edx; // reserved >>> @@ -505,6 +520,14 @@ >>> result |= CPU_CLMUL; >>> if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>> result |= CPU_RTM; >>> + if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>> + result |= CPU_ADX; >>> + if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>> + result |= CPU_BMI2; >>> + if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>> + result |= CPU_SHA; >>> + if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>> + result |= CPU_FMA; >>> >>> // AMD features. >>> if (is_amd()) { >>> @@ -518,16 +541,8 @@ >>> } >>> // Intel features. 
>>> if(is_intel()) { >>> - if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>> - result |= CPU_ADX; >>> - if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>> - result |= CPU_BMI2; >>> - if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>> - result |= CPU_SHA; >>> if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel != 0) >>> result |= CPU_LZCNT; >>> - if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>> - result |= CPU_FMA; >>> // for Intel, ecx.bits.misalignsse bit (bit 8) indicates >>> support for prefetchw >>> if (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse != 0) { >>> result |= CPU_3DNOW_PREFETCH; >>> @@ -590,6 +605,7 @@ >>> static ByteSize ext_cpuid5_offset() { return >>> byte_offset_of(CpuidInfo, ext_cpuid5_eax); } >>> static ByteSize ext_cpuid7_offset() { return >>> byte_offset_of(CpuidInfo, ext_cpuid7_eax); } >>> static ByteSize ext_cpuid8_offset() { return >>> byte_offset_of(CpuidInfo, ext_cpuid8_eax); } >>> + static ByteSize ext_cpuid1E_offset() { return >>> byte_offset_of(CpuidInfo, ext_cpuid1E_eax); } >>> static ByteSize tpl_cpuidB0_offset() { return >>> byte_offset_of(CpuidInfo, tpl_cpuidB0_eax); } >>> static ByteSize tpl_cpuidB1_offset() { return >>> byte_offset_of(CpuidInfo, tpl_cpuidB1_eax); } >>> static ByteSize tpl_cpuidB2_offset() { return >>> byte_offset_of(CpuidInfo, tpl_cpuidB2_eax); } >>> @@ -673,8 +689,12 @@ >>> if (is_intel() && supports_processor_topology()) { >>> result = _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus; >>> } else if (_cpuid_info.std_cpuid1_edx.bits.ht != 0) { >>> - result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>> - cores_per_cpu(); >>> + if (cpu_family() >= 0x17) { >>> + result = _cpuid_info.ext_cpuid1E_ebx.bits.threads_per_core + 1; >>> + } else { >>> + result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>> + cores_per_cpu(); >>> + } >>> } >>> return (result == 0 ? 1 : result); >>> } >>> >>> Please let me know your comments. >>> Thanks for your review. >>> >>> Regards, >>> Rohit >>> >>>> >>>> On 9/11/17 9:52 PM, Rohit Arul Raj wrote: >>>>> >>>>> Hello David, >>>>> >>>>>>> >>>>>>> 1. ExtCpuid1EEx >>>>>>> >>>>>>> Should this be ExtCpuid1EEbx? (I see the naming here is somewhat >>>>>>> inconsistent - and potentially confusing: I would have preferred to >>>>>>> see >>>>>>> things like ExtCpuid_1E_Ebx, to make it clear.) >>>>>> >>>>>> >>>>>> Yes, I can change it accordingly. >>>>>> >>>>> I have attached the updated, re-tested patch as per your comments above. >>>>> >>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>> @@ -70,7 +70,7 @@ >>>>> bool use_evex = FLAG_IS_DEFAULT(UseAVX) || (UseAVX > 2); >>>>> >>>>> Label detect_486, cpu486, detect_586, std_cpuid1, std_cpuid4; >>>>> - Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >>>>> done, wrapup; >>>>> + Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >>>>> ext_cpuid8, done, wrapup; >>>>> Label legacy_setup, save_restore_except, legacy_save_restore, >>>>> start_simd_check; >>>>> >>>>> StubCodeMark mark(this, "VM_Version", "get_cpu_info_stub"); >>>>> @@ -272,9 +272,23 @@ >>>>> __ jccb(Assembler::belowEqual, ext_cpuid5); >>>>> __ cmpl(rax, 0x80000007); // Is cpuid(0x80000008) supported? >>>>> __ jccb(Assembler::belowEqual, ext_cpuid7); >>>>> + __ cmpl(rax, 0x80000008); // Is cpuid(0x8000001E) supported? 
>>>>> + __ jccb(Assembler::belowEqual, ext_cpuid8); >>>>> + // >>>>> + // Extended cpuid(0x8000001E) >>>>> + // >>>>> + __ movl(rax, 0x8000001E); >>>>> + __ cpuid(); >>>>> + __ lea(rsi, Address(rbp, >>>>> in_bytes(VM_Version::ext_cpuid_1E_offset()))); >>>>> + __ movl(Address(rsi, 0), rax); >>>>> + __ movl(Address(rsi, 4), rbx); >>>>> + __ movl(Address(rsi, 8), rcx); >>>>> + __ movl(Address(rsi,12), rdx); >>>>> + >>>>> // >>>>> // Extended cpuid(0x80000008) >>>>> // >>>>> + __ bind(ext_cpuid8); >>>>> __ movl(rax, 0x80000008); >>>>> __ cpuid(); >>>>> __ lea(rsi, Address(rbp, >>>>> in_bytes(VM_Version::ext_cpuid8_offset()))); >>>>> @@ -1109,11 +1123,27 @@ >>>>> } >>>>> >>>>> #ifdef COMPILER2 >>>>> - if (MaxVectorSize > 16) { >>>>> - // Limit vectors size to 16 bytes on current AMD cpus. >>>>> + if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>> + // Limit vectors size to 16 bytes on AMD cpus < 17h. >>>>> FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>> } >>>>> #endif // COMPILER2 >>>>> + >>>>> + // Some defaults for AMD family 17h >>>>> + if ( cpu_family() == 0x17 ) { >>>>> + // On family 17h processors use XMM and UnalignedLoadStores for >>>>> Array Copy >>>>> + if (supports_sse2() && FLAG_IS_DEFAULT(UseXMMForArrayCopy)) { >>>>> + FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >>>>> + } >>>>> + if (supports_sse2() && FLAG_IS_DEFAULT(UseUnalignedLoadStores)) { >>>>> + FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >>>>> + } >>>>> +#ifdef COMPILER2 >>>>> + if (supports_sse4_2() && FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>>> + FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>> + } >>>>> +#endif >>>>> + } >>>>> } >>>>> >>>>> if( is_intel() ) { // Intel cpus specific settings >>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>> @@ -228,6 +228,15 @@ >>>>> } bits; >>>>> }; >>>>> >>>>> + union ExtCpuid_1E_Ebx { >>>>> + uint32_t value; >>>>> + struct { >>>>> + uint32_t : 8, >>>>> + threads_per_core : 8, >>>>> + : 16; >>>>> + } bits; >>>>> + }; >>>>> + >>>>> union XemXcr0Eax { >>>>> uint32_t value; >>>>> struct { >>>>> @@ -398,6 +407,12 @@ >>>>> ExtCpuid8Ecx ext_cpuid8_ecx; >>>>> uint32_t ext_cpuid8_edx; // reserved >>>>> >>>>> + // cpuid function 0x8000001E // AMD 17h >>>>> + uint32_t ext_cpuid_1E_eax; >>>>> + ExtCpuid_1E_Ebx ext_cpuid_1E_ebx; // threads per core (AMD17h) >>>>> + uint32_t ext_cpuid_1E_ecx; >>>>> + uint32_t ext_cpuid_1E_edx; // unused currently >>>>> + >>>>> // extended control register XCR0 (the XFEATURE_ENABLED_MASK >>>>> register) >>>>> XemXcr0Eax xem_xcr0_eax; >>>>> uint32_t xem_xcr0_edx; // reserved >>>>> @@ -505,6 +520,14 @@ >>>>> result |= CPU_CLMUL; >>>>> if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>>>> result |= CPU_RTM; >>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>> + result |= CPU_ADX; >>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>> + result |= CPU_BMI2; >>>>> + if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>> + result |= CPU_SHA; >>>>> + if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>> + result |= CPU_FMA; >>>>> >>>>> // AMD features. >>>>> if (is_amd()) { >>>>> @@ -518,16 +541,8 @@ >>>>> } >>>>> // Intel features. 
>>>>> if(is_intel()) { >>>>> - if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>> - result |= CPU_ADX; >>>>> - if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>> - result |= CPU_BMI2; >>>>> - if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>> - result |= CPU_SHA; >>>>> if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel != 0) >>>>> result |= CPU_LZCNT; >>>>> - if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>> - result |= CPU_FMA; >>>>> // for Intel, ecx.bits.misalignsse bit (bit 8) indicates >>>>> support for prefetchw >>>>> if (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse != 0) { >>>>> result |= CPU_3DNOW_PREFETCH; >>>>> @@ -590,6 +605,7 @@ >>>>> static ByteSize ext_cpuid5_offset() { return >>>>> byte_offset_of(CpuidInfo, ext_cpuid5_eax); } >>>>> static ByteSize ext_cpuid7_offset() { return >>>>> byte_offset_of(CpuidInfo, ext_cpuid7_eax); } >>>>> static ByteSize ext_cpuid8_offset() { return >>>>> byte_offset_of(CpuidInfo, ext_cpuid8_eax); } >>>>> + static ByteSize ext_cpuid_1E_offset() { return >>>>> byte_offset_of(CpuidInfo, ext_cpuid_1E_eax); } >>>>> static ByteSize tpl_cpuidB0_offset() { return >>>>> byte_offset_of(CpuidInfo, tpl_cpuidB0_eax); } >>>>> static ByteSize tpl_cpuidB1_offset() { return >>>>> byte_offset_of(CpuidInfo, tpl_cpuidB1_eax); } >>>>> static ByteSize tpl_cpuidB2_offset() { return >>>>> byte_offset_of(CpuidInfo, tpl_cpuidB2_eax); } >>>>> @@ -673,8 +689,11 @@ >>>>> if (is_intel() && supports_processor_topology()) { >>>>> result = _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus; >>>>> } else if (_cpuid_info.std_cpuid1_edx.bits.ht != 0) { >>>>> - result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>>>> - cores_per_cpu(); >>>>> + if (cpu_family() >= 0x17) >>>>> + result = _cpuid_info.ext_cpuid_1E_ebx.bits.threads_per_core + >>>>> 1; >>>>> + else >>>>> + result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>>>> + cores_per_cpu(); >>>>> } >>>>> return (result == 0 ? 1 : result); >>>>> } >>>>> >>>>> >>>>> Please let me know your comments >>>>> >>>>> Thanks for your time. >>>>> >>>>> Regards, >>>>> Rohit >>>>> >>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>> >>>>>>>> Reference: >>>>>>>> >>>>>>>> >>>>>>>> https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf >>>>>>>> [Pg 82] >>>>>>>> >>>>>>>> CPUID_Fn8000001E_EBX [Core Identifiers] (CoreId) >>>>>>>> 15:8 ThreadsPerCore: threads per core. Read-only. Reset: >>>>>>>> XXh. >>>>>>>> The number of threads per core is ThreadsPerCore+1. >>>>>>>> >>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>> @@ -70,7 +70,7 @@ >>>>>>>> bool use_evex = FLAG_IS_DEFAULT(UseAVX) || (UseAVX > 2); >>>>>>>> >>>>>>>> Label detect_486, cpu486, detect_586, std_cpuid1, std_cpuid4; >>>>>>>> - Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >>>>>>>> done, wrapup; >>>>>>>> + Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >>>>>>>> ext_cpuid8, done, wrapup; >>>>>>>> Label legacy_setup, save_restore_except, legacy_save_restore, >>>>>>>> start_simd_check; >>>>>>>> >>>>>>>> StubCodeMark mark(this, "VM_Version", "get_cpu_info_stub"); >>>>>>>> @@ -272,9 +272,23 @@ >>>>>>>> __ jccb(Assembler::belowEqual, ext_cpuid5); >>>>>>>> __ cmpl(rax, 0x80000007); // Is cpuid(0x80000008) >>>>>>>> supported? >>>>>>>> __ jccb(Assembler::belowEqual, ext_cpuid7); >>>>>>>> + __ cmpl(rax, 0x80000008); // Is cpuid(0x8000001E) supported? 
>>>>>>>> + __ jccb(Assembler::belowEqual, ext_cpuid8); >>>>>>>> + // >>>>>>>> + // Extended cpuid(0x8000001E) >>>>>>>> + // >>>>>>>> + __ movl(rax, 0x8000001E); >>>>>>>> + __ cpuid(); >>>>>>>> + __ lea(rsi, Address(rbp, >>>>>>>> in_bytes(VM_Version::ext_cpuid1E_offset()))); >>>>>>>> + __ movl(Address(rsi, 0), rax); >>>>>>>> + __ movl(Address(rsi, 4), rbx); >>>>>>>> + __ movl(Address(rsi, 8), rcx); >>>>>>>> + __ movl(Address(rsi,12), rdx); >>>>>>>> + >>>>>>>> // >>>>>>>> // Extended cpuid(0x80000008) >>>>>>>> // >>>>>>>> + __ bind(ext_cpuid8); >>>>>>>> __ movl(rax, 0x80000008); >>>>>>>> __ cpuid(); >>>>>>>> __ lea(rsi, Address(rbp, >>>>>>>> in_bytes(VM_Version::ext_cpuid8_offset()))); >>>>>>>> @@ -1109,11 +1123,27 @@ >>>>>>>> } >>>>>>>> >>>>>>>> #ifdef COMPILER2 >>>>>>>> - if (MaxVectorSize > 16) { >>>>>>>> - // Limit vectors size to 16 bytes on current AMD cpus. >>>>>>>> + if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>>>> + // Limit vectors size to 16 bytes on AMD cpus < 17h. >>>>>>>> FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>>>> } >>>>>>>> #endif // COMPILER2 >>>>>>>> + >>>>>>>> + // Some defaults for AMD family 17h >>>>>>>> + if ( cpu_family() == 0x17 ) { >>>>>>>> + // On family 17h processors use XMM and UnalignedLoadStores >>>>>>>> for >>>>>>>> Array Copy >>>>>>>> + if (supports_sse2() && FLAG_IS_DEFAULT(UseXMMForArrayCopy)) { >>>>>>>> + FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >>>>>>>> + } >>>>>>>> + if (supports_sse2() && >>>>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) >>>>>>>> { >>>>>>>> + FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >>>>>>>> + } >>>>>>>> +#ifdef COMPILER2 >>>>>>>> + if (supports_sse4_2() && FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>>>>>> + FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>>>> + } >>>>>>>> +#endif >>>>>>>> + } >>>>>>>> } >>>>>>>> >>>>>>>> if( is_intel() ) { // Intel cpus specific settings >>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>> @@ -228,6 +228,15 @@ >>>>>>>> } bits; >>>>>>>> }; >>>>>>>> >>>>>>>> + union ExtCpuid1EEx { >>>>>>>> + uint32_t value; >>>>>>>> + struct { >>>>>>>> + uint32_t : 8, >>>>>>>> + threads_per_core : 8, >>>>>>>> + : 16; >>>>>>>> + } bits; >>>>>>>> + }; >>>>>>>> + >>>>>>>> union XemXcr0Eax { >>>>>>>> uint32_t value; >>>>>>>> struct { >>>>>>>> @@ -398,6 +407,12 @@ >>>>>>>> ExtCpuid8Ecx ext_cpuid8_ecx; >>>>>>>> uint32_t ext_cpuid8_edx; // reserved >>>>>>>> >>>>>>>> + // cpuid function 0x8000001E // AMD 17h >>>>>>>> + uint32_t ext_cpuid1E_eax; >>>>>>>> + ExtCpuid1EEx ext_cpuid1E_ebx; // threads per core (AMD17h) >>>>>>>> + uint32_t ext_cpuid1E_ecx; >>>>>>>> + uint32_t ext_cpuid1E_edx; // unused currently >>>>>>>> + >>>>>>>> // extended control register XCR0 (the XFEATURE_ENABLED_MASK >>>>>>>> register) >>>>>>>> XemXcr0Eax xem_xcr0_eax; >>>>>>>> uint32_t xem_xcr0_edx; // reserved >>>>>>>> @@ -505,6 +520,14 @@ >>>>>>>> result |= CPU_CLMUL; >>>>>>>> if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>>>>>>> result |= CPU_RTM; >>>>>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>> + result |= CPU_ADX; >>>>>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>> + result |= CPU_BMI2; >>>>>>>> + if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>> + result |= CPU_SHA; >>>>>>>> + if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>> + result |= CPU_FMA; >>>>>>>> >>>>>>>> // AMD features. 
>>>>>>>> if (is_amd()) { >>>>>>>> @@ -518,16 +541,8 @@ >>>>>>>> } >>>>>>>> // Intel features. >>>>>>>> if(is_intel()) { >>>>>>>> - if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>> - result |= CPU_ADX; >>>>>>>> - if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>> - result |= CPU_BMI2; >>>>>>>> - if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>> - result |= CPU_SHA; >>>>>>>> if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel != 0) >>>>>>>> result |= CPU_LZCNT; >>>>>>>> - if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>> - result |= CPU_FMA; >>>>>>>> // for Intel, ecx.bits.misalignsse bit (bit 8) indicates >>>>>>>> support for prefetchw >>>>>>>> if (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse != 0) { >>>>>>>> result |= CPU_3DNOW_PREFETCH; >>>>>>>> @@ -590,6 +605,7 @@ >>>>>>>> static ByteSize ext_cpuid5_offset() { return >>>>>>>> byte_offset_of(CpuidInfo, ext_cpuid5_eax); } >>>>>>>> static ByteSize ext_cpuid7_offset() { return >>>>>>>> byte_offset_of(CpuidInfo, ext_cpuid7_eax); } >>>>>>>> static ByteSize ext_cpuid8_offset() { return >>>>>>>> byte_offset_of(CpuidInfo, ext_cpuid8_eax); } >>>>>>>> + static ByteSize ext_cpuid1E_offset() { return >>>>>>>> byte_offset_of(CpuidInfo, ext_cpuid1E_eax); } >>>>>>>> static ByteSize tpl_cpuidB0_offset() { return >>>>>>>> byte_offset_of(CpuidInfo, tpl_cpuidB0_eax); } >>>>>>>> static ByteSize tpl_cpuidB1_offset() { return >>>>>>>> byte_offset_of(CpuidInfo, tpl_cpuidB1_eax); } >>>>>>>> static ByteSize tpl_cpuidB2_offset() { return >>>>>>>> byte_offset_of(CpuidInfo, tpl_cpuidB2_eax); } >>>>>>>> @@ -673,8 +689,11 @@ >>>>>>>> if (is_intel() && supports_processor_topology()) { >>>>>>>> result = _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus; >>>>>>>> } else if (_cpuid_info.std_cpuid1_edx.bits.ht != 0) { >>>>>>>> - result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>>>>>>> - cores_per_cpu(); >>>>>>>> + if (cpu_family() >= 0x17) >>>>>>>> + result = _cpuid_info.ext_cpuid1E_ebx.bits.threads_per_core + >>>>>>>> 1; >>>>>>>> + else >>>>>>>> + result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>>>>>>> + cores_per_cpu(); >>>>>>>> } >>>>>>>> return (result == 0 ? 1 : result); >>>>>>>> } >>>>>>>> >>>>>>>> I have attached the patch for review. >>>>>>>> Please let me know your comments. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Rohit >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Vladimir >>>>>>>>> >>>>>>>>> >>>>>>>>>> src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>> >>>>>>>>>> No comments on AMD specific changes. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> ----- >>>>>>>>>> >>>>>>>>>> On 5/09/2017 3:43 PM, David Holmes wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 5/09/2017 3:29 PM, Rohit Arul Raj wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hello David, >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Sep 5, 2017 at 10:31 AM, David Holmes >>>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Rohit, >>>>>>>>>>>>> >>>>>>>>>>>>> I was unable to apply your patch to latest jdk10/hs/hotspot >>>>>>>>>>>>> repo. >>>>>>>>>>>>> >>>>>>>>>>>> I checked out the latest jdk10/hs/hotspot [parent: >>>>>>>>>>>> 13548:1a9c2e07a826] >>>>>>>>>>>> and was able to apply the patch >>>>>>>>>>>> [epyc-amd17h-defaults-3Sept.patch] >>>>>>>>>>>> without any issues. >>>>>>>>>>>> Can you share the error message that you are getting? 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I was getting this: >>>>>>>>>>> >>>>>>>>>>> applying hotspot.patch >>>>>>>>>>> patching file src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>> Hunk #1 FAILED at 1108 >>>>>>>>>>> 1 out of 1 hunks FAILED -- saving rejects to file >>>>>>>>>>> src/cpu/x86/vm/vm_version_x86.cpp.rej >>>>>>>>>>> patching file src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>> Hunk #2 FAILED at 522 >>>>>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file >>>>>>>>>>> src/cpu/x86/vm/vm_version_x86.hpp.rej >>>>>>>>>>> abort: patch failed to apply >>>>>>>>>>> >>>>>>>>>>> but I started again and this time it applied fine, so not sure >>>>>>>>>>> what >>>>>>>>>>> was >>>>>>>>>>> going on there. >>>>>>>>>>> >>>>>>>>>>> Cheers, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> Regards, >>>>>>>>>>>> Rohit >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 4/09/2017 2:42 AM, Rohit Arul Raj wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hello Vladimir, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Sat, Sep 2, 2017 at 11:25 PM, Vladimir Kozlov >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Rohit, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 9/2/17 1:16 AM, Rohit Arul Raj wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hello Vladimir, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Changes look good. Only question I have is about >>>>>>>>>>>>>>>>> MaxVectorSize. >>>>>>>>>>>>>>>>> It >>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>> set >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 16 only in presence of AVX: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/046eab27258f/src/cpu/x86/vm/vm_version_x86.cpp#l945 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Does that code works for AMD 17h too? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for pointing that out. Yes, the code works fine for >>>>>>>>>>>>>>>> AMD >>>>>>>>>>>>>>>> 17h. >>>>>>>>>>>>>>>> So >>>>>>>>>>>>>>>> I have removed the surplus check for MaxVectorSize from my >>>>>>>>>>>>>>>> patch. >>>>>>>>>>>>>>>> I >>>>>>>>>>>>>>>> have updated, re-tested and attached the patch. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Which check you removed? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> My older patch had the below mentioned check which was required >>>>>>>>>>>>>> on >>>>>>>>>>>>>> JDK9 where the default MaxVectorSize was 64. It has been >>>>>>>>>>>>>> handled >>>>>>>>>>>>>> better in openJDK10. So this check is not required anymore. >>>>>>>>>>>>>> >>>>>>>>>>>>>> + // Some defaults for AMD family 17h >>>>>>>>>>>>>> + if ( cpu_family() == 0x17 ) { >>>>>>>>>>>>>> ... >>>>>>>>>>>>>> ... >>>>>>>>>>>>>> + if (MaxVectorSize > 32) { >>>>>>>>>>>>>> + FLAG_SET_DEFAULT(MaxVectorSize, 32); >>>>>>>>>>>>>> + } >>>>>>>>>>>>>> .. >>>>>>>>>>>>>> .. 
>>>>>>>>>>>>>> + } >>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have one query regarding the setting of UseSHA flag: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/046eab27258f/src/cpu/x86/vm/vm_version_x86.cpp#l821 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> AMD 17h has support for SHA. >>>>>>>>>>>>>>>> AMD 15h doesn't have support for SHA. Still "UseSHA" flag >>>>>>>>>>>>>>>> gets >>>>>>>>>>>>>>>> enabled for it based on the availability of BMI2 and AVX2. Is >>>>>>>>>>>>>>>> there >>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>> underlying reason for this? I have handled this in the patch >>>>>>>>>>>>>>>> but >>>>>>>>>>>>>>>> just >>>>>>>>>>>>>>>> wanted to confirm. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It was done with next changes which use only AVX2 and BMI2 >>>>>>>>>>>>>>> instructions >>>>>>>>>>>>>>> to >>>>>>>>>>>>>>> calculate SHA-256: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/6a17c49de974 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I don't know if AMD 15h supports these instructions and can >>>>>>>>>>>>>>> execute >>>>>>>>>>>>>>> that >>>>>>>>>>>>>>> code. You need to test it. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> Ok, got it. Since AMD15h has support for AVX2 and BMI2 >>>>>>>>>>>>>> instructions, >>>>>>>>>>>>>> it should work. >>>>>>>>>>>>>> Confirmed by running following sanity tests: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> ./hotspot/test/compiler/intrinsics/sha/sanity/TestSHA1Intrinsics.java >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> ./hotspot/test/compiler/intrinsics/sha/sanity/TestSHA512Intrinsics.java >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> ./hotspot/test/compiler/intrinsics/sha/sanity/TestSHA256Intrinsics.java >>>>>>>>>>>>>> >>>>>>>>>>>>>> So I have removed those SHA checks from my patch too. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please find attached updated, re-tested patch. >>>>>>>>>>>>>> >>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>> @@ -1109,11 +1109,27 @@ >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> #ifdef COMPILER2 >>>>>>>>>>>>>> - if (MaxVectorSize > 16) { >>>>>>>>>>>>>> - // Limit vectors size to 16 bytes on current AMD cpus. >>>>>>>>>>>>>> + if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>>>>>>>>>> + // Limit vectors size to 16 bytes on AMD cpus < 17h. 
>>>>>>>>>>>>>> FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>>>>>>>>>> } >>>>>>>>>>>>>> #endif // COMPILER2 >>>>>>>>>>>>>> + >>>>>>>>>>>>>> + // Some defaults for AMD family 17h >>>>>>>>>>>>>> + if ( cpu_family() == 0x17 ) { >>>>>>>>>>>>>> + // On family 17h processors use XMM and >>>>>>>>>>>>>> UnalignedLoadStores >>>>>>>>>>>>>> for >>>>>>>>>>>>>> Array Copy >>>>>>>>>>>>>> + if (supports_sse2() && >>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseXMMForArrayCopy)) { >>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >>>>>>>>>>>>>> + } >>>>>>>>>>>>>> + if (supports_sse2() && >>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) >>>>>>>>>>>>>> { >>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >>>>>>>>>>>>>> + } >>>>>>>>>>>>>> +#ifdef COMPILER2 >>>>>>>>>>>>>> + if (supports_sse4_2() && >>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseFPUForSpilling)) >>>>>>>>>>>>>> { >>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>>>>>>>>>> + } >>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>> + } >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> if( is_intel() ) { // Intel cpus specific settings >>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>> @@ -505,6 +505,14 @@ >>>>>>>>>>>>>> result |= CPU_CLMUL; >>>>>>>>>>>>>> if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>>>>>>>>>>>>> result |= CPU_RTM; >>>>>>>>>>>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>> + result |= CPU_ADX; >>>>>>>>>>>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>> + result |= CPU_BMI2; >>>>>>>>>>>>>> + if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>> + result |= CPU_SHA; >>>>>>>>>>>>>> + if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>> + result |= CPU_FMA; >>>>>>>>>>>>>> >>>>>>>>>>>>>> // AMD features. >>>>>>>>>>>>>> if (is_amd()) { >>>>>>>>>>>>>> @@ -515,19 +523,13 @@ >>>>>>>>>>>>>> result |= CPU_LZCNT; >>>>>>>>>>>>>> if (_cpuid_info.ext_cpuid1_ecx.bits.sse4a != 0) >>>>>>>>>>>>>> result |= CPU_SSE4A; >>>>>>>>>>>>>> + if(_cpuid_info.std_cpuid1_edx.bits.ht != 0) >>>>>>>>>>>>>> + result |= CPU_HT; >>>>>>>>>>>>>> } >>>>>>>>>>>>>> // Intel features. >>>>>>>>>>>>>> if(is_intel()) { >>>>>>>>>>>>>> - if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>> - result |= CPU_ADX; >>>>>>>>>>>>>> - if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>> - result |= CPU_BMI2; >>>>>>>>>>>>>> - if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>> - result |= CPU_SHA; >>>>>>>>>>>>>> if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel != >>>>>>>>>>>>>> 0) >>>>>>>>>>>>>> result |= CPU_LZCNT; >>>>>>>>>>>>>> - if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>> - result |= CPU_FMA; >>>>>>>>>>>>>> // for Intel, ecx.bits.misalignsse bit (bit 8) >>>>>>>>>>>>>> indicates >>>>>>>>>>>>>> support for prefetchw >>>>>>>>>>>>>> if (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse != >>>>>>>>>>>>>> 0) >>>>>>>>>>>>>> { >>>>>>>>>>>>>> result |= CPU_3DNOW_PREFETCH; >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please let me know your comments. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for your time. >>>>>>>>>>>>>> Rohit >>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for taking time to review the code. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>> @@ -1088,6 +1088,22 @@ >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> FLAG_SET_DEFAULT(UseSSE42Intrinsics, false); >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> + if (supports_sha()) { >>>>>>>>>>>>>>>> + if (FLAG_IS_DEFAULT(UseSHA)) { >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA, true); >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> + } else if (UseSHA || UseSHA1Intrinsics || >>>>>>>>>>>>>>>> UseSHA256Intrinsics >>>>>>>>>>>>>>>> || >>>>>>>>>>>>>>>> UseSHA512Intrinsics) { >>>>>>>>>>>>>>>> + if (!FLAG_IS_DEFAULT(UseSHA) || >>>>>>>>>>>>>>>> + !FLAG_IS_DEFAULT(UseSHA1Intrinsics) || >>>>>>>>>>>>>>>> + !FLAG_IS_DEFAULT(UseSHA256Intrinsics) || >>>>>>>>>>>>>>>> + !FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>> + warning("SHA instructions are not available on this >>>>>>>>>>>>>>>> CPU"); >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA, false); >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> // some defaults for AMD family 15h >>>>>>>>>>>>>>>> if ( cpu_family() == 0x15 ) { >>>>>>>>>>>>>>>> @@ -1109,11 +1125,40 @@ >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> #ifdef COMPILER2 >>>>>>>>>>>>>>>> - if (MaxVectorSize > 16) { >>>>>>>>>>>>>>>> - // Limit vectors size to 16 bytes on current AMD cpus. >>>>>>>>>>>>>>>> + if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>>>>>>>>>>>> + // Limit vectors size to 16 bytes on AMD cpus < 17h. 
>>>>>>>>>>>>>>>> FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> #endif // COMPILER2 >>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>> + // Some defaults for AMD family 17h >>>>>>>>>>>>>>>> + if ( cpu_family() == 0x17 ) { >>>>>>>>>>>>>>>> + // On family 17h processors use XMM and >>>>>>>>>>>>>>>> UnalignedLoadStores >>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>> Array Copy >>>>>>>>>>>>>>>> + if (supports_sse2() && >>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseXMMForArrayCopy)) >>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> + if (supports_sse2() && >>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) { >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> + if (supports_bmi2() && >>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseBMI2Instructions)) >>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseBMI2Instructions, true); >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> + if (UseSHA) { >>>>>>>>>>>>>>>> + if (FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>> + } else if (UseSHA512Intrinsics) { >>>>>>>>>>>>>>>> + warning("Intrinsics for SHA-384 and SHA-512 crypto >>>>>>>>>>>>>>>> hash >>>>>>>>>>>>>>>> functions not available on this CPU."); >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> +#ifdef COMPILER2 >>>>>>>>>>>>>>>> + if (supports_sse4_2()) { >>>>>>>>>>>>>>>> + if (FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> if( is_intel() ) { // Intel cpus specific settings >>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>> @@ -505,6 +505,14 @@ >>>>>>>>>>>>>>>> result |= CPU_CLMUL; >>>>>>>>>>>>>>>> if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>>>>>>>>>>>>>>> result |= CPU_RTM; >>>>>>>>>>>>>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>>> + result |= CPU_ADX; >>>>>>>>>>>>>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>>> + result |= CPU_BMI2; >>>>>>>>>>>>>>>> + if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>>> + result |= CPU_SHA; >>>>>>>>>>>>>>>> + if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>>> + result |= CPU_FMA; >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> // AMD features. >>>>>>>>>>>>>>>> if (is_amd()) { >>>>>>>>>>>>>>>> @@ -515,19 +523,13 @@ >>>>>>>>>>>>>>>> result |= CPU_LZCNT; >>>>>>>>>>>>>>>> if (_cpuid_info.ext_cpuid1_ecx.bits.sse4a != 0) >>>>>>>>>>>>>>>> result |= CPU_SSE4A; >>>>>>>>>>>>>>>> + if(_cpuid_info.std_cpuid1_edx.bits.ht != 0) >>>>>>>>>>>>>>>> + result |= CPU_HT; >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> // Intel features. 
>>>>>>>>>>>>>>>> if(is_intel()) { >>>>>>>>>>>>>>>> - if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>>> - result |= CPU_ADX; >>>>>>>>>>>>>>>> - if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>>> - result |= CPU_BMI2; >>>>>>>>>>>>>>>> - if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>>> - result |= CPU_SHA; >>>>>>>>>>>>>>>> if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel >>>>>>>>>>>>>>>> != >>>>>>>>>>>>>>>> 0) >>>>>>>>>>>>>>>> result |= CPU_LZCNT; >>>>>>>>>>>>>>>> - if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>>> - result |= CPU_FMA; >>>>>>>>>>>>>>>> // for Intel, ecx.bits.misalignsse bit (bit 8) >>>>>>>>>>>>>>>> indicates >>>>>>>>>>>>>>>> support for prefetchw >>>>>>>>>>>>>>>> if (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse >>>>>>>>>>>>>>>> != >>>>>>>>>>>>>>>> 0) { >>>>>>>>>>>>>>>> result |= CPU_3DNOW_PREFETCH; >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>> Rohit >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 9/1/17 8:04 AM, Rohit Arul Raj wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Fri, Sep 1, 2017 at 10:27 AM, Rohit Arul Raj >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, Sep 1, 2017 at 3:01 AM, David Holmes >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi Rohit, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I think the patch needs updating for jdk10 as I already >>>>>>>>>>>>>>>>>>>> see >>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>> lot of >>>>>>>>>>>>>>>>>>>> logic >>>>>>>>>>>>>>>>>>>> around UseSHA in vm_version_x86.cpp. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks David, I will update the patch wrt JDK10 source >>>>>>>>>>>>>>>>>>> base, >>>>>>>>>>>>>>>>>>> test >>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>> resubmit for review. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>>>>> Rohit >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi All, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have updated the patch wrt openjdk10/hotspot (parent: >>>>>>>>>>>>>>>>>> 13519:71337910df60), did regression testing using jtreg >>>>>>>>>>>>>>>>>> ($make >>>>>>>>>>>>>>>>>> default) and didnt find any regressions. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Can anyone please volunteer to review this patch which >>>>>>>>>>>>>>>>>> sets >>>>>>>>>>>>>>>>>> flag/ISA >>>>>>>>>>>>>>>>>> defaults for newer AMD 17h (EPYC) processor? 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> ************************* Patch >>>>>>>>>>>>>>>>>> **************************** >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>> @@ -1088,6 +1088,22 @@ >>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>> FLAG_SET_DEFAULT(UseSSE42Intrinsics, false); >>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>> + if (supports_sha()) { >>>>>>>>>>>>>>>>>> + if (FLAG_IS_DEFAULT(UseSHA)) { >>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA, true); >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> + } else if (UseSHA || UseSHA1Intrinsics || >>>>>>>>>>>>>>>>>> UseSHA256Intrinsics >>>>>>>>>>>>>>>>>> || >>>>>>>>>>>>>>>>>> UseSHA512Intrinsics) { >>>>>>>>>>>>>>>>>> + if (!FLAG_IS_DEFAULT(UseSHA) || >>>>>>>>>>>>>>>>>> + !FLAG_IS_DEFAULT(UseSHA1Intrinsics) || >>>>>>>>>>>>>>>>>> + !FLAG_IS_DEFAULT(UseSHA256Intrinsics) || >>>>>>>>>>>>>>>>>> + !FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>>>> + warning("SHA instructions are not available on >>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>> CPU"); >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA, false); >>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> // some defaults for AMD family 15h >>>>>>>>>>>>>>>>>> if ( cpu_family() == 0x15 ) { >>>>>>>>>>>>>>>>>> @@ -1109,11 +1125,43 @@ >>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> #ifdef COMPILER2 >>>>>>>>>>>>>>>>>> - if (MaxVectorSize > 16) { >>>>>>>>>>>>>>>>>> - // Limit vectors size to 16 bytes on current AMD >>>>>>>>>>>>>>>>>> cpus. >>>>>>>>>>>>>>>>>> + if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>>>>>>>>>>>>>> + // Limit vectors size to 16 bytes on AMD cpus < 17h. 
>>>>>>>>>>>>>>>>>> FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>> #endif // COMPILER2 >>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>> + // Some defaults for AMD family 17h >>>>>>>>>>>>>>>>>> + if ( cpu_family() == 0x17 ) { >>>>>>>>>>>>>>>>>> + // On family 17h processors use XMM and >>>>>>>>>>>>>>>>>> UnalignedLoadStores >>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>> Array Copy >>>>>>>>>>>>>>>>>> + if (supports_sse2() && >>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseXMMForArrayCopy)) >>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>> + UseXMMForArrayCopy = true; >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> + if (supports_sse2() && >>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) >>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>> + UseUnalignedLoadStores = true; >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> + if (supports_bmi2() && >>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseBMI2Instructions)) { >>>>>>>>>>>>>>>>>> + UseBMI2Instructions = true; >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> + if (MaxVectorSize > 32) { >>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(MaxVectorSize, 32); >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> + if (UseSHA) { >>>>>>>>>>>>>>>>>> + if (FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>>> + } else if (UseSHA512Intrinsics) { >>>>>>>>>>>>>>>>>> + warning("Intrinsics for SHA-384 and SHA-512 >>>>>>>>>>>>>>>>>> crypto >>>>>>>>>>>>>>>>>> hash >>>>>>>>>>>>>>>>>> functions not available on this CPU."); >>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> +#ifdef COMPILER2 >>>>>>>>>>>>>>>>>> + if (supports_sse4_2()) { >>>>>>>>>>>>>>>>>> + if (FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> if( is_intel() ) { // Intel cpus specific >>>>>>>>>>>>>>>>>> settings >>>>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>> @@ -505,6 +505,14 @@ >>>>>>>>>>>>>>>>>> result |= CPU_CLMUL; >>>>>>>>>>>>>>>>>> if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>>>>>>>>>>>>>>>>> result |= CPU_RTM; >>>>>>>>>>>>>>>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>>>>> + result |= CPU_ADX; >>>>>>>>>>>>>>>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>>>>> + result |= CPU_BMI2; >>>>>>>>>>>>>>>>>> + if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>>>>> + result |= CPU_SHA; >>>>>>>>>>>>>>>>>> + if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>>>>> + result |= CPU_FMA; >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> // AMD features. >>>>>>>>>>>>>>>>>> if (is_amd()) { >>>>>>>>>>>>>>>>>> @@ -515,19 +523,13 @@ >>>>>>>>>>>>>>>>>> result |= CPU_LZCNT; >>>>>>>>>>>>>>>>>> if (_cpuid_info.ext_cpuid1_ecx.bits.sse4a != >>>>>>>>>>>>>>>>>> 0) >>>>>>>>>>>>>>>>>> result |= CPU_SSE4A; >>>>>>>>>>>>>>>>>> + if(_cpuid_info.std_cpuid1_edx.bits.ht != 0) >>>>>>>>>>>>>>>>>> + result |= CPU_HT; >>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>> // Intel features. 
>>>>>>>>>>>>>>>>>> if(is_intel()) { >>>>>>>>>>>>>>>>>> - if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>>>>> - result |= CPU_ADX; >>>>>>>>>>>>>>>>>> - if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>>>>> - result |= CPU_BMI2; >>>>>>>>>>>>>>>>>> - if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>>>>> - result |= CPU_SHA; >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel >>>>>>>>>>>>>>>>>> != 0) >>>>>>>>>>>>>>>>>> result |= CPU_LZCNT; >>>>>>>>>>>>>>>>>> - if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>>>>> - result |= CPU_FMA; >>>>>>>>>>>>>>>>>> // for Intel, ecx.bits.misalignsse bit (bit >>>>>>>>>>>>>>>>>> 8) >>>>>>>>>>>>>>>>>> indicates >>>>>>>>>>>>>>>>>> support for prefetchw >>>>>>>>>>>>>>>>>> if >>>>>>>>>>>>>>>>>> (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse >>>>>>>>>>>>>>>>>> != >>>>>>>>>>>>>>>>>> 0) { >>>>>>>>>>>>>>>>>> result |= CPU_3DNOW_PREFETCH; >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> ************************************************************** >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Rohit >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 1/09/2017 1:11 AM, Rohit Arul Raj wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 5:59 PM, David Holmes >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Hi Rohit, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 31/08/2017 7:03 PM, Rohit Arul Raj wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I would like an volunteer to review this patch >>>>>>>>>>>>>>>>>>>>>>> (openJDK9) >>>>>>>>>>>>>>>>>>>>>>> which >>>>>>>>>>>>>>>>>>>>>>> sets >>>>>>>>>>>>>>>>>>>>>>> flag/ISA defaults for newer AMD 17h (EPYC) processor >>>>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>> help >>>>>>>>>>>>>>>>>>>>>>> us >>>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>> the commit process. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> https://www.dropbox.com/sh/08bsxaxupg8kbam/AADurTXLGIZ6C-tiIAi_Glyka?dl=0 >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Unfortunately patches can not be accepted from systems >>>>>>>>>>>>>>>>>>>>>> outside >>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>> OpenJDK >>>>>>>>>>>>>>>>>>>>>> infrastructure and ... >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I have also attached the patch (hg diff -g) for >>>>>>>>>>>>>>>>>>>>>>> reference. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> ... 
unfortunately patches tend to get stripped by the >>>>>>>>>>>>>>>>>>>>>> mail >>>>>>>>>>>>>>>>>>>>>> servers. >>>>>>>>>>>>>>>>>>>>>> If >>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>> patch is small please include it inline. Otherwise you >>>>>>>>>>>>>>>>>>>>>> will >>>>>>>>>>>>>>>>>>>>>> need >>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>> find >>>>>>>>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>>>>>>> OpenJDK Author who can host it for you on >>>>>>>>>>>>>>>>>>>>>> cr.openjdk.java.net. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> 3) I have done regression testing using jtreg ($make >>>>>>>>>>>>>>>>>>>>>>> default) >>>>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>> didnt find any regressions. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Sounds good, but until I see the patch it is hard to >>>>>>>>>>>>>>>>>>>>>> comment >>>>>>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>>>>> testing >>>>>>>>>>>>>>>>>>>>>> requirements. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks David, >>>>>>>>>>>>>>>>>>>>> Yes, it's a small patch. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>>>> @@ -1051,6 +1051,22 @@ >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> FLAG_SET_DEFAULT(UseSSE42Intrinsics, >>>>>>>>>>>>>>>>>>>>> false); >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> + if (supports_sha()) { >>>>>>>>>>>>>>>>>>>>> + if (FLAG_IS_DEFAULT(UseSHA)) { >>>>>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA, true); >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> + } else if (UseSHA || UseSHA1Intrinsics || >>>>>>>>>>>>>>>>>>>>> UseSHA256Intrinsics >>>>>>>>>>>>>>>>>>>>> || >>>>>>>>>>>>>>>>>>>>> UseSHA512Intrinsics) { >>>>>>>>>>>>>>>>>>>>> + if (!FLAG_IS_DEFAULT(UseSHA) || >>>>>>>>>>>>>>>>>>>>> + !FLAG_IS_DEFAULT(UseSHA1Intrinsics) || >>>>>>>>>>>>>>>>>>>>> + !FLAG_IS_DEFAULT(UseSHA256Intrinsics) || >>>>>>>>>>>>>>>>>>>>> + !FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>>>>>>> + warning("SHA instructions are not available on >>>>>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>>>> CPU"); >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA, false); >>>>>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>>>>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>>>>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> // some defaults for AMD family 15h >>>>>>>>>>>>>>>>>>>>> if ( cpu_family() == 0x15 ) { >>>>>>>>>>>>>>>>>>>>> @@ -1072,11 +1088,43 @@ >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> #ifdef COMPILER2 >>>>>>>>>>>>>>>>>>>>> - if (MaxVectorSize > 16) { >>>>>>>>>>>>>>>>>>>>> - // Limit vectors size to 16 bytes on current AMD >>>>>>>>>>>>>>>>>>>>> cpus. >>>>>>>>>>>>>>>>>>>>> + if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>>>>>>>>>>>>>>>>> + // Limit vectors size to 16 bytes on AMD cpus < >>>>>>>>>>>>>>>>>>>>> 17h. 
>>>>>>>>>>>>>>>>>>>>> FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> #endif // COMPILER2 >>>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>> + // Some defaults for AMD family 17h >>>>>>>>>>>>>>>>>>>>> + if ( cpu_family() == 0x17 ) { >>>>>>>>>>>>>>>>>>>>> + // On family 17h processors use XMM and >>>>>>>>>>>>>>>>>>>>> UnalignedLoadStores >>>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>>> Array Copy >>>>>>>>>>>>>>>>>>>>> + if (supports_sse2() && >>>>>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseXMMForArrayCopy)) >>>>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>> + UseXMMForArrayCopy = true; >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> + if (supports_sse2() && >>>>>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) >>>>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>> + UseUnalignedLoadStores = true; >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> + if (supports_bmi2() && >>>>>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseBMI2Instructions)) >>>>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>> + UseBMI2Instructions = true; >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> + if (MaxVectorSize > 32) { >>>>>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(MaxVectorSize, 32); >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> + if (UseSHA) { >>>>>>>>>>>>>>>>>>>>> + if (FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>>>>>> + } else if (UseSHA512Intrinsics) { >>>>>>>>>>>>>>>>>>>>> + warning("Intrinsics for SHA-384 and SHA-512 >>>>>>>>>>>>>>>>>>>>> crypto >>>>>>>>>>>>>>>>>>>>> hash >>>>>>>>>>>>>>>>>>>>> functions not available on this CPU."); >>>>>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> +#ifdef COMPILER2 >>>>>>>>>>>>>>>>>>>>> + if (supports_sse4_2()) { >>>>>>>>>>>>>>>>>>>>> + if (FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>>>>>>>>>>>>>>>>>>> + FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> if( is_intel() ) { // Intel cpus specific >>>>>>>>>>>>>>>>>>>>> settings >>>>>>>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>>>> @@ -513,6 +513,16 @@ >>>>>>>>>>>>>>>>>>>>> result |= CPU_LZCNT; >>>>>>>>>>>>>>>>>>>>> if (_cpuid_info.ext_cpuid1_ecx.bits.sse4a >>>>>>>>>>>>>>>>>>>>> != >>>>>>>>>>>>>>>>>>>>> 0) >>>>>>>>>>>>>>>>>>>>> result |= CPU_SSE4A; >>>>>>>>>>>>>>>>>>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>>>>>>>> + result |= CPU_BMI2; >>>>>>>>>>>>>>>>>>>>> + if(_cpuid_info.std_cpuid1_edx.bits.ht != 0) >>>>>>>>>>>>>>>>>>>>> + result |= CPU_HT; >>>>>>>>>>>>>>>>>>>>> + if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>>>>>>>> + result |= CPU_ADX; >>>>>>>>>>>>>>>>>>>>> + if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>>>>>>>> + result |= CPU_SHA; >>>>>>>>>>>>>>>>>>>>> + if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>>>>>>>> + result |= CPU_FMA; >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> // Intel features. 
>>>>>>>>>>>>>>>>>>>>> if(is_intel()) { >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>>>>>>> Rohit >>>>>>>>>>>>>>>>>>>>> From coleen.phillimore at oracle.com Mon Oct 16 14:56:35 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 16 Oct 2017 10:56:35 -0400 Subject: RFR: 8189333: Fix Zero build after Atomic::xchg changes In-Reply-To: <9cd66129-3636-8de3-4624-a69bd8f28b99@redhat.com> References: <003ff7d9-759f-1ef5-f580-18c2571b63e5@redhat.com> <441ed55f-6398-9fa1-d571-86548ed5a2a9@oracle.com> <9cd66129-3636-8de3-4624-a69bd8f28b99@redhat.com> Message-ID: <60f5a96a-8f91-8114-c8de-8a72004fcb75@oracle.com> Thank you for the patch.? I have scripts to remember these options. There used to be other options but it doesn't look like I need them now. thanks, Coleen On 10/16/17 9:45 AM, Roman Kennke wrote: > Hi Coleen, > > Nope. It fails with this (and a bunch of similar) errors: > https://paste.fedoraproject.org/paste/cWKozoxY23z72~EMm0BPBA > > > It does build with this additional patch: > http://cr.openjdk.java.net/~rkennke/fix-zero-coleen/webrev/ > > > I.e.: > - cast BasicLock to markOop by using markOopDesc::encode() > - use oopDesc::cas_set_mark() instead of the raw Atomic ops (probably > not strictly required for this change, but still much nicer) > > > You should not require any build scripts for Zero though. Simply run > configure with --with-jvm-variants=zero and build in the corresponding > linux-x86_64-normal-zero-slowdebug or similar directory using the > usual make calls. > > >> >> Hi Roman, Can you build zero with this changeset? >> >> http://cr.openjdk.java.net/~coleenp/8188220.03/webrev/index.html >> >> My scripts for building zero are broken now. >> >> thanks, >> Coleen >> >> On 10/15/17 5:40 PM, Roman Kennke wrote: >>> Am 15.10.2017 um 23:32 schrieb David Holmes: >>>> Hi Roman, >>>> >>>> On 16/10/2017 7:12 AM, Roman Kennke wrote: >>>>> Zero debug build has been broken by: JDK-8187977: Generalize >>>>> Atomic::xchg to use templates. >>>>> >>>>> This patch fixes it by casting the unsigned literal to jint: >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8189333/webrev.00/ >>>>> >>>> >>>> Looks fine. >>>> >>>> I can push this for you straight away (relatively speaking :) ) >>>> under the trivial rule. >>> Thanks! >>> >>> Roman >> > From stefan.karlsson at oracle.com Mon Oct 16 15:40:04 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 16 Oct 2017 17:40:04 +0200 Subject: 8189360: JvmtiExport::weak_oops_do is called for all JNIHandleBlock instances Message-ID: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> Hi all, Please review this patch to move the call of the static JvmtiExport::weak_oops_do out of the JNIHandleBlock::weak_oops_do member function into the new WeakProcessor. Today, this isn't causing any bugs because there's only one instance of JNIHandleBlock, the _weak_global_handles. However, in prototypes with more than one JNIHandleBlock, this results in multiple calls to JvmtiExport::weak_oops_do. http://cr.openjdk.java.net/~stefank/8189360/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8189360 This patch builds upon the patch in: http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-October/028684.html Tested with JPRT. 
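For orientation, here is a minimal sketch of the shape of that refactoring, assuming a WeakProcessor with a single static entry point. The class name comes from the mail above, but the member list and the exact call sites shown here are illustrative guesses, not the contents of the webrev:

    // Sketch only: assumes HotSpot's BoolObjectClosure/OopClosure types and an
    // AllStatic WeakProcessor; the real patch may be organized differently.
    class WeakProcessor : AllStatic {
     public:
      static void weak_oops_do(BoolObjectClosure* is_alive, OopClosure* keep_alive) {
        // Visit each weak-root source exactly once per weak processing pass.
        JNIHandles::weak_oops_do(is_alive, keep_alive);
        JvmtiExport::weak_oops_do(is_alive, keep_alive);
      }
    };

With JvmtiExport::weak_oops_do called from a single place like this, JNIHandleBlock::weak_oops_do only has to walk its own handles, so having more than one block can no longer trigger repeated JVMTI callbacks.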
Thanks, StefanK From coleen.phillimore at oracle.com Mon Oct 16 15:59:45 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 16 Oct 2017 11:59:45 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <2f32124d-2428-678d-ef50-3306231aa848@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> <0784FA88-3D00-4DBA-8726-3A3B23C91B3E@oracle.com> <2f32124d-2428-678d-ef50-3306231aa848@oracle.com> Message-ID: <5a787ec8-afe6-b8a6-23de-5d6a5b935035@oracle.com> The latest incremental based on these comments (now running tier1). http://cr.openjdk.java.net/~coleenp/8188220.review-comments.02/webrev/index.html plus what Roman sent in the "RFR: 8189333: Fix Zero build after Atomic::xchg changes" thread. thanks, Coleen On 10/16/17 9:13 AM, coleen.phillimore at oracle.com wrote: > > > On 10/14/17 7:36 PM, Kim Barrett wrote: >>> On Oct 13, 2017, at 2:34 PM, coleen.phillimore at oracle.com wrote: >>> >>> >>> Hi, Here is the version with the changes from Kim's comments that >>> has passed at least testing with JPRT and tier1, locally.?? More >>> testing (tier2-5) is in progress. >>> >>> Also includes a corrected version of Atomic::sub care of Erik >>> Osterlund. >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/8188220.kim-review-changes/webrev >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev >>> >>> Full version: >>> >>> http://cr.openjdk.java.net/~coleenp/8188220.03/webrev >>> >>> Thanks! >>> Coleen >> I still dislike and disagree with what is being proposed regarding >> replace_if_null. > > We can discuss that seperately, please file an RFE. >> >> ------------------------------------------------------------------------------ >> >> I forgot that I'd promised you an updated Atomic::sub definition. >> Unfortunately, the new one still has problems, performing some >> conversions that should not be permitted (and are disallowed by >> Atomic::add).? Try this instead.? (This hasn't been tested, not even >> compiled; hopefully I don't have any typos or anything.)? The intent >> is that this supports the same conversions as Atomic::add. >> >> template >> inline D Atomic::sub(I sub_value, D volatile* dest) { >> ?? STATIC_ASSERT(IsPointer::value || IsIntegral::value); >> ?? STATIC_ASSERT(IsIntegral::value); >> ?? // If D is a pointer type, use [u]intptr_t as the addend type, >> ?? // matching signedness of I.? Otherwise, use D as the addend type. >> ?? typedef typename Conditional::value, intptr_t, >> uintptr_t>::type PI; >> ?? typedef typename Conditional::value, PI, D>::type >> AddendType; >> ?? // Only allow conversions that can't change the value. >> ?? STATIC_ASSERT(IsSigned::value == IsSigned::value); >> ?? STATIC_ASSERT(sizeof(I) <= sizeof(AddendType)); >> ?? AddendType addend = sub_value; >> ?? // Assumes two's complement integer representation. >> ?? #pragma warning(suppress: 4146) // In case AddendType is not signed. >> ?? return Atomic::add(-addend, dest); >> } > > Uh, Ok.? I'll try it out. >> >>>>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>>>> 7960?? Atomic::add(-n, &_num_par_pushes); >>>>> >>>>> Atomic::sub >>>> fixed. >> Nope, not fixed in http://cr.openjdk.java.net/~coleenp/8188220.03/webrev > > Missed it twice now.? I think I have it now. >>>>> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >>>>> ?? 200?????? PerRegionTable* res = >>>>> ?? 201???????? 
Atomic::cmpxchg(nxt, &_free_list, fl); >>>>> >>>>> Please remove the line break, now that the code has been simplified. >>>>> >>>>> But wait, doesn't this alloc exhibit classic ABA problems?? I *think* >>>>> this works because alloc and bulk_free are called in different >>>>> phases, >>>>> never overlapping. >>>> I don't know.? Do you want to file a bug to investigate this? >>>> fixed. >> No, I now think it?s ok, though confusing. >> >>>>> src/hotspot/share/gc/g1/sparsePRT.cpp >>>>> ?? 295???? SparsePRT* res = >>>>> ?? 296?????? Atomic::cmpxchg(sprt, &_head_expanded_list, hd); >>>>> and >>>>> ?? 307???? SparsePRT* res = >>>>> ?? 308?????? Atomic::cmpxchg(next, &_head_expanded_list, hd); >>>>> >>>>> I'd rather not have the line breaks in these either. >>>>> >>>>> And get_from_expanded_list also appears to have classic ABA problems. >>>>> I *think* this works because add_to_expanded_list and >>>>> get_from_expanded_list are called in different phases, never >>>>> overlapping. >>>> Fixed, same question as above?? Or one bug to investigate both? >> Again, I think it?s ok, though confusing. >> >>>>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>>>> ?? 262?? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>>>> ?? 263?????????????????????????????????? (volatile intptr_t *)&_data, >>>>> ?? 264 (intptr_t)old_age._data); >>>>> >>>>> This should be >>>>> >>>>> ??? return Atomic::cmpxchg(new_age._data, &_data, old_age._data); >>>> fixed. >> Still casting the result. > > I thought I fixed it.? I think I fixed it now. >> >>>>> src/hotspot/share/oops/method.hpp >>>>> ?? 139?? volatile address from_compiled_entry() const?? { return >>>>> OrderAccess::load_acquire(&_from_compiled_entry); } >>>>> ?? 140?? volatile address from_compiled_entry_no_trampoline() const; >>>>> ?? 141?? volatile address from_interpreted_entry() const{ return >>>>> OrderAccess::load_acquire(&_from_interpreted_entry); } >>>>> >>>>> [pre-existing] >>>>> The volatile qualifiers here seem suspect to me. >>>> Again much suspicion about concurrency and giant pain, which I >>>> remember, of debugging these when they were broken. >> Let me be more direct: the volatile qualifiers for the function return >> types are bogus and confusing, and should be removed. > > Okay, sure. > >> >>>>> src/hotspot/share/prims/jni.cpp >>>>> >>>>> [pre-existing] >>>>> >>>>> copy_jni_function_table should be using Copy::disjoint_words_atomic. >>>> yuck. >> Of course, neither is entirely technically correct, since both are >> treating conversion of function pointers to void* as okay in shared >> code, e.g. violating some of the raison d'etre of CAST_{TO,FROM}_FN_PTR. >> For way more detail than you probably care about, see the discussion >> starting here: >> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-March/018578.html >> >> through (5 messages in total) >> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-March/018623.html >> >> >> Oh well. >> >>>>> src/hotspot/share/runtime/mutex.hpp >>>>> >>>>> [pre-existing] >>>>> >>>>> I think the Address member of the SplitWord union is unused. Looking >>>>> at AcquireOrPush (and others), I'm wondering whether it *should* be >>>>> used there, or whether just using intptr_t casts and doing integral >>>>> arithmetic (as is presently being done) is easier and clearer. >>>>> >>>>> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp >>>>> rather than polluting the global namespace.? And technically, that >>>>> name is reserved word. 
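On the reserved-name point just above: identifiers that begin with an underscore followed by an uppercase letter (such as _LSBINDEX or _LBIT) are reserved to the implementation for any use, in every scope, so keeping them out of headers only partially addresses the issue. A hedged illustration with made-up replacement names, not the actual mutex.cpp definitions:

    // Reserved to the implementation (underscore + capital): _LSBINDEX, _LBIT.
    // A conforming, file-local alternative inside mutex.cpp could look like:
    static const intptr_t LBIT      = 1;   // hypothetical name for the lock bit
    static const int      LSB_INDEX = 0;   // hypothetical name; an index of the
                                           // least-significant byte would depend
                                           // on the platform's byte order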
>>>> I moved both this and _LBIT into the top of mutex.cpp since they >>>> are used there. >> Good. >> >>>> Cant define const intptr_t _LBIT =1; in a class in our version of C++. >> Sorry, please explain?? If you tried to move it into SplitWord, that >> doesn?t work; >> unions are not permitted to have static data members (I don?t >> off-hand know why, >> just that it?s explicitly forbidden). >> >> And you left the seemingly unused Address member in SplitWord. > > This is the compilation error I get: > > /scratch/cphillim/hg/10ptr2/open/src/hotspot/share/runtime/mutex.hpp:124:33: > error: non-static data member initializers only available with > -std=c++11 or -std=gnu++11 [-Werror] > ?? const intptr_t _NEW_LOCKBIT = 1; > > > I don't own this SplitWord code so do not want to remove the unused > Address member. > >> >>>>> src/hotspot/share/runtime/thread.cpp >>>>> 4707?? intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, >>>>> (intptr_t)0); >>>>> >>>>> This and other places suggest LOCKBIT should be defined as intptr_t, >>>>> rather than as an enum value.? The MuxBits enum type is unused. >>>>> >>>>> And the cast of 0 is another case where implicit widening would be >>>>> nice. >>>> Making LOCKBIT a const intptr_t = 1 removes a lot of casts. >> Because of the new definition of LOCKBIT I noticed the immediately >> preceeding typedef for MutexT, which seems to be unused. > > Removed MutexT. >> >> ------------------------------------------------------------------------------ >> >> src/hotspot/share/oops/cpCache.cpp >> ? 114 bool ConstantPoolCacheEntry::init_flags_atomic(intx flags) { >> ? 115?? intptr_t result = Atomic::cmpxchg(flags, &_flags, (intx)0); >> ? 116?? return (result == 0); >> ? 117 } >> >> [I missed this on earlier pass.] >> >> Should be >> >> bool ConstantPoolCacheEntry::init_flags_atomic(intx flags) { >> ?? return Atomic::cmpxchg(flags, &_flags, (intx)0) == 0; >> } >> >> Otherwise, I end up asking why result is intptr_t when the cmpxchg is >> dealing with intx.? Yeah, one's a typedef of the other, but mixing >> them like that in the same expression is not helpful. >> >> > Sure why not? > > Actually init_flags_atomic is not used and neither is > init_method_flags_atomic so I did one better and removed them. > > Thanks for the again thorough code review and Atomic::sub.?? I'll post > incremental when it compiles. > > Coleen From kim.barrett at oracle.com Mon Oct 16 17:14:48 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 16 Oct 2017 13:14:48 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <2f32124d-2428-678d-ef50-3306231aa848@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> <0784FA88-3D00-4DBA-8726-3A3B23C91B3E@oracle.com> <2f32124d-2428-678d-ef50-3306231aa848@oracle.com> Message-ID: <36C02953-AF4E-4A89-92CE-70FE4293965A@oracle.com> > On Oct 16, 2017, at 9:13 AM, coleen.phillimore at oracle.com wrote: >>>> Cant define const intptr_t _LBIT =1; in a class in our version of C++. >> Sorry, please explain? If you tried to move it into SplitWord, that doesn?t work; >> unions are not permitted to have static data members (I don?t off-hand know why, >> just that it?s explicitly forbidden). >> >> And you left the seemingly unused Address member in SplitWord. 
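To make the language rules in this exchange concrete, a small illustration; the type and constant names are simplified stand-ins, not the real SplitWord or mutex.hpp code:

    struct LockConstants {                  // hypothetical holder class
      static const intptr_t LBIT = 1;       // OK even pre-C++11: in-class initializer
                                            // is allowed for a static const integral
      // const intptr_t LBIT2 = 1;          // error before C++11: a non-static data
                                            // member initializer needs -std=c++11
    };

    union SplitWordLike {                   // simplified stand-in for SplitWord
      volatile intptr_t FullWord;
      volatile char     Bytes[sizeof(intptr_t)];
      // static const intptr_t LBIT = 1;    // ill-formed: a union may not contain
                                            // a static data member
    };

So the constant can live as a static const member of an ordinary class, or as a file-scope static in mutex.cpp, but not inside the union itself, which is why the suggestion that follows is simply to add 'static' rather than rely on a C++11 member initializer.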
> > This is the compilation error I get: > > /scratch/cphillim/hg/10ptr2/open/src/hotspot/share/runtime/mutex.hpp:124:33: error: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [-Werror] > const intptr_t _NEW_LOCKBIT = 1; Needs ?static? in a class. From coleen.phillimore at oracle.com Mon Oct 16 19:31:56 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 16 Oct 2017 15:31:56 -0400 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <87BC5241-9C27-457F-9856-3D969831DABC@physik.fu-berlin.de> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <87BC5241-9C27-457F-9856-3D969831DABC@physik.fu-berlin.de> Message-ID: <0ce8d126-1f0c-5cd3-edbc-b9bd36d17801@oracle.com> On 10/15/17 2:06 AM, John Paul Adrian Glaubitz wrote: > Hi Roman! > > Please let me look at SPARC next week first before merging this. > > And thanks for notifying me that Zero is broken again *sigh*. > > People, please test your changes. Yes, I know you all just care about Hotspot. But please understand that there are many people out there who rely on Zero, i.e. they are using it. Breaking code that people actively use is not nice and should not happen in a project like OpenJDK. > > Building Zero takes maybe 5 minutes on a fast x86 machine, so I would like to ask everyone to please test their changes against Zero as well. These tests will keep the headaches for people relying on Zero low and also avoids that distributions have to ship many patches on top of OpenJDK upstream. I used to be able to compile and link Zero and have fixed it but as an occasional task, it's something that stops working. At one point, I thought I'd filed an internal bug so that zero is built in JPRT. So I can compile zero again, but can't link on OL7 (Oracle's RedHat version of linux). Error: failed /scratch/cphillim/hg/10ptr3/build/linux-x64/jdk/lib/server/libjvm.so, because libffi.so.5: cannot open shared object file: No such file or directory I did a "yum install libffi" which seemed to succeed. Help? Coleen > > If you cannot test your patch on a given platform X, please let me know. I have access to every platform supported by OpenJDK except AIX/PPC. > > Thanks, > Adrian > >> On Oct 15, 2017, at 12:41 AM, Roman Kennke wrote: >> >> The JEP to remove the Shark compiler has received exclusively positive feedback (JDK-8189173) on zero-dev. So here comes the big patch to remove it. >> >> What I have done: >> >> grep -i -R shark src >> grep -i -R shark make >> grep -i -R shark doc >> grep -i -R shark doc >> >> and purged any reference to shark. Almost everything was straightforward. >> >> The only things I wasn't really sure of: >> >> - in globals.hpp, I re-arranged the KIND_* bits to account for the gap that removing KIND_SHARK left. I hope that's good? >> - in relocInfo_zero.hpp I put a ShouldNotCallThis() in pd_address_in_code(), I am not sure it is the right thing to do. If not, what *would* be the right thing? >> >> Then of course I did: >> >> rm -rf src/hotspot/share/shark >> >> I also went through the build machinery and removed stuff related to Shark and LLVM libs. >> >> Now the only references in the whole JDK tree to shark is a 'Shark Bay' in a timezone file, and 'Wireshark' in some tests ;-) >> >> I tested by building a regular x86 JVM and running JTREG tests. All looks fine. 
>> >> - I could not build zero because it seems broken because of the recent Atomic::* changes >> - I could not test any of the other arches that seemed to reference Shark (arm and sparc) >> >> Here's the full webrev: >> >> http://cr.openjdk.java.net/~rkennke/8171853/webrev.00/ >> >> Can I get a review on this? >> >> Thanks, Roman From rkennke at redhat.com Mon Oct 16 19:37:41 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 16 Oct 2017 21:37:41 +0200 Subject: RFR: 8171853: Remove Shark compiler In-Reply-To: <0ce8d126-1f0c-5cd3-edbc-b9bd36d17801@oracle.com> References: <92a2fec1-88f1-2579-1202-299b13062b7b@redhat.com> <87BC5241-9C27-457F-9856-3D969831DABC@physik.fu-berlin.de> <0ce8d126-1f0c-5cd3-edbc-b9bd36d17801@oracle.com> Message-ID: <9aa4fc18-0c52-6570-902f-586ef981020e@redhat.com> Am 16.10.2017 um 21:31 schrieb coleen.phillimore at oracle.com: > > > On 10/15/17 2:06 AM, John Paul Adrian Glaubitz wrote: >> Hi Roman! >> >> Please let me look at SPARC next week first before merging this. >> >> And thanks for notifying me that Zero is broken again *sigh*. >> >> People, please test your changes. Yes, I know you all just care about >> Hotspot. But please understand that there are many people out there >> who rely on Zero, i.e. they are using it. Breaking code that people >> actively use is not nice and should not happen in a project like >> OpenJDK. >> >> Building Zero takes maybe 5 minutes on a fast x86 machine, so I would >> like to ask everyone to please test their changes against Zero as >> well. These tests will keep the headaches for people relying on Zero >> low and also avoids that distributions have to ship many patches on >> top of OpenJDK upstream. > > I used to be able to compile and link Zero and have fixed it but as an > occasional task, it's something that stops working. > > At one point, I thought I'd filed an internal bug so that zero is > built in JPRT. > > So I can compile zero again, but can't link on OL7 (Oracle's RedHat > version of linux). > > Error: failed > /scratch/cphillim/hg/10ptr3/build/linux-x64/jdk/lib/server/libjvm.so, > because libffi.so.5: cannot open shared object file: No such file or > directory > > I did a "yum install libffi" which seemed to succeed. What you want is: "yum install libffi-devel" This is the only additional dependency that Zero has. And I'm doing this on CentOS7 (an open source version of RHEL7), which should practically be the same in this regard as OL7. Roman From mark.reinhold at oracle.com Mon Oct 16 21:38:18 2017 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Mon, 16 Oct 2017 14:38:18 -0700 (PDT) Subject: JEP 310: Application Class-Data Sharing Message-ID: <20171016213818.E4CE2EA182@eggemoggin.niobe.net> New JEP Candidate: http://openjdk.java.net/jeps/310 - Mark From david.holmes at oracle.com Mon Oct 16 21:58:01 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 17 Oct 2017 07:58:01 +1000 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <5a787ec8-afe6-b8a6-23de-5d6a5b935035@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> <0784FA88-3D00-4DBA-8726-3A3B23C91B3E@oracle.com> <2f32124d-2428-678d-ef50-3306231aa848@oracle.com> <5a787ec8-afe6-b8a6-23de-5d6a5b935035@oracle.com> Message-ID: <2adcda24-1386-b5ee-81d3-2e4604b0f4d5@oracle.com> Seems okay. 
Thanks, David On 17/10/2017 1:59 AM, coleen.phillimore at oracle.com wrote: > > The latest incremental based on these comments (now running tier1). > http://cr.openjdk.java.net/~coleenp/8188220.review-comments.02/webrev/index.html > > > plus what Roman sent in the "RFR: 8189333: Fix Zero build after > Atomic::xchg changes" thread. > > thanks, > Coleen > > On 10/16/17 9:13 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 10/14/17 7:36 PM, Kim Barrett wrote: >>>> On Oct 13, 2017, at 2:34 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> Hi, Here is the version with the changes from Kim's comments that >>>> has passed at least testing with JPRT and tier1, locally.?? More >>>> testing (tier2-5) is in progress. >>>> >>>> Also includes a corrected version of Atomic::sub care of Erik >>>> Osterlund. >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/8188220.kim-review-changes/webrev >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev >>>> >>>> Full version: >>>> >>>> http://cr.openjdk.java.net/~coleenp/8188220.03/webrev >>>> >>>> Thanks! >>>> Coleen >>> I still dislike and disagree with what is being proposed regarding >>> replace_if_null. >> >> We can discuss that seperately, please file an RFE. >>> >>> ------------------------------------------------------------------------------ >>> >>> I forgot that I'd promised you an updated Atomic::sub definition. >>> Unfortunately, the new one still has problems, performing some >>> conversions that should not be permitted (and are disallowed by >>> Atomic::add).? Try this instead.? (This hasn't been tested, not even >>> compiled; hopefully I don't have any typos or anything.)? The intent >>> is that this supports the same conversions as Atomic::add. >>> >>> template >>> inline D Atomic::sub(I sub_value, D volatile* dest) { >>> ?? STATIC_ASSERT(IsPointer::value || IsIntegral::value); >>> ?? STATIC_ASSERT(IsIntegral::value); >>> ?? // If D is a pointer type, use [u]intptr_t as the addend type, >>> ?? // matching signedness of I.? Otherwise, use D as the addend type. >>> ?? typedef typename Conditional::value, intptr_t, >>> uintptr_t>::type PI; >>> ?? typedef typename Conditional::value, PI, D>::type >>> AddendType; >>> ?? // Only allow conversions that can't change the value. >>> ?? STATIC_ASSERT(IsSigned::value == IsSigned::value); >>> ?? STATIC_ASSERT(sizeof(I) <= sizeof(AddendType)); >>> ?? AddendType addend = sub_value; >>> ?? // Assumes two's complement integer representation. >>> ?? #pragma warning(suppress: 4146) // In case AddendType is not signed. >>> ?? return Atomic::add(-addend, dest); >>> } >> >> Uh, Ok.? I'll try it out. >>> >>>>>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>>>>> 7960?? Atomic::add(-n, &_num_par_pushes); >>>>>> >>>>>> Atomic::sub >>>>> fixed. >>> Nope, not fixed in http://cr.openjdk.java.net/~coleenp/8188220.03/webrev >> >> Missed it twice now.? I think I have it now. >>>>>> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >>>>>> ?? 200?????? PerRegionTable* res = >>>>>> ?? 201???????? Atomic::cmpxchg(nxt, &_free_list, fl); >>>>>> >>>>>> Please remove the line break, now that the code has been simplified. >>>>>> >>>>>> But wait, doesn't this alloc exhibit classic ABA problems?? I *think* >>>>>> this works because alloc and bulk_free are called in different >>>>>> phases, >>>>>> never overlapping. >>>>> I don't know.? Do you want to file a bug to investigate this? >>>>> fixed. >>> No, I now think it?s ok, though confusing. 
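For readers who have not met the ABA pattern being referred to here and in the sparsePRT comment that follows, a generic sketch of the hazard on a cmpxchg-based free list; this is purely illustrative and is not the PerRegionTable or SparsePRT code:

    struct Node { Node* _next; Node* next() const { return _next; } };  // placeholder type

    // Lock-free pop from a singly linked free list.
    Node* pop(Node* volatile* head_addr) {
      while (true) {
        Node* hd = *head_addr;
        if (hd == NULL) return NULL;
        Node* nxt = hd->next();                           // (1) read successor
        // ABA: if another thread pops hd, recycles nxt, and pushes hd back
        // between (1) and (2), the compare below still sees hd ("A" again)
        // and succeeds, installing a stale nxt as the new head.
        if (Atomic::cmpxchg(nxt, head_addr, hd) == hd) {  // (2) swing the head
          return hd;
        }
      }
    }

The free lists above get away with this only because, as noted, the allocating and bulk-freeing phases never overlap.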
>>> >>>>>> src/hotspot/share/gc/g1/sparsePRT.cpp >>>>>> ?? 295???? SparsePRT* res = >>>>>> ?? 296?????? Atomic::cmpxchg(sprt, &_head_expanded_list, hd); >>>>>> and >>>>>> ?? 307???? SparsePRT* res = >>>>>> ?? 308?????? Atomic::cmpxchg(next, &_head_expanded_list, hd); >>>>>> >>>>>> I'd rather not have the line breaks in these either. >>>>>> >>>>>> And get_from_expanded_list also appears to have classic ABA problems. >>>>>> I *think* this works because add_to_expanded_list and >>>>>> get_from_expanded_list are called in different phases, never >>>>>> overlapping. >>>>> Fixed, same question as above?? Or one bug to investigate both? >>> Again, I think it?s ok, though confusing. >>> >>>>>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>>>>> ?? 262?? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>>>>> ?? 263?????????????????????????????????? (volatile intptr_t *)&_data, >>>>>> ?? 264 (intptr_t)old_age._data); >>>>>> >>>>>> This should be >>>>>> >>>>>> ??? return Atomic::cmpxchg(new_age._data, &_data, old_age._data); >>>>> fixed. >>> Still casting the result. >> >> I thought I fixed it.? I think I fixed it now. >>> >>>>>> src/hotspot/share/oops/method.hpp >>>>>> ?? 139?? volatile address from_compiled_entry() const?? { return >>>>>> OrderAccess::load_acquire(&_from_compiled_entry); } >>>>>> ?? 140?? volatile address from_compiled_entry_no_trampoline() const; >>>>>> ?? 141?? volatile address from_interpreted_entry() const{ return >>>>>> OrderAccess::load_acquire(&_from_interpreted_entry); } >>>>>> >>>>>> [pre-existing] >>>>>> The volatile qualifiers here seem suspect to me. >>>>> Again much suspicion about concurrency and giant pain, which I >>>>> remember, of debugging these when they were broken. >>> Let me be more direct: the volatile qualifiers for the function return >>> types are bogus and confusing, and should be removed. >> >> Okay, sure. >> >>> >>>>>> src/hotspot/share/prims/jni.cpp >>>>>> >>>>>> [pre-existing] >>>>>> >>>>>> copy_jni_function_table should be using Copy::disjoint_words_atomic. >>>>> yuck. >>> Of course, neither is entirely technically correct, since both are >>> treating conversion of function pointers to void* as okay in shared >>> code, e.g. violating some of the raison d'etre of CAST_{TO,FROM}_FN_PTR. >>> For way more detail than you probably care about, see the discussion >>> starting here: >>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-March/018578.html >>> >>> through (5 messages in total) >>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-March/018623.html >>> >>> >>> Oh well. >>> >>>>>> src/hotspot/share/runtime/mutex.hpp >>>>>> >>>>>> [pre-existing] >>>>>> >>>>>> I think the Address member of the SplitWord union is unused. Looking >>>>>> at AcquireOrPush (and others), I'm wondering whether it *should* be >>>>>> used there, or whether just using intptr_t casts and doing integral >>>>>> arithmetic (as is presently being done) is easier and clearer. >>>>>> >>>>>> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp >>>>>> rather than polluting the global namespace.? And technically, that >>>>>> name is reserved word. >>>>> I moved both this and _LBIT into the top of mutex.cpp since they >>>>> are used there. >>> Good. >>> >>>>> Cant define const intptr_t _LBIT =1; in a class in our version of C++. >>> Sorry, please explain?? 
If you tried to move it into SplitWord, that >>> doesn?t work; >>> unions are not permitted to have static data members (I don?t >>> off-hand know why, >>> just that it?s explicitly forbidden). >>> >>> And you left the seemingly unused Address member in SplitWord. >> >> This is the compilation error I get: >> >> /scratch/cphillim/hg/10ptr2/open/src/hotspot/share/runtime/mutex.hpp:124:33: >> error: non-static data member initializers only available with >> -std=c++11 or -std=gnu++11 [-Werror] >> ?? const intptr_t _NEW_LOCKBIT = 1; >> >> >> I don't own this SplitWord code so do not want to remove the unused >> Address member. >> >>> >>>>>> src/hotspot/share/runtime/thread.cpp >>>>>> 4707?? intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, >>>>>> (intptr_t)0); >>>>>> >>>>>> This and other places suggest LOCKBIT should be defined as intptr_t, >>>>>> rather than as an enum value.? The MuxBits enum type is unused. >>>>>> >>>>>> And the cast of 0 is another case where implicit widening would be >>>>>> nice. >>>>> Making LOCKBIT a const intptr_t = 1 removes a lot of casts. >>> Because of the new definition of LOCKBIT I noticed the immediately >>> preceeding typedef for MutexT, which seems to be unused. >> >> Removed MutexT. >>> >>> ------------------------------------------------------------------------------ >>> >>> src/hotspot/share/oops/cpCache.cpp >>> ? 114 bool ConstantPoolCacheEntry::init_flags_atomic(intx flags) { >>> ? 115?? intptr_t result = Atomic::cmpxchg(flags, &_flags, (intx)0); >>> ? 116?? return (result == 0); >>> ? 117 } >>> >>> [I missed this on earlier pass.] >>> >>> Should be >>> >>> bool ConstantPoolCacheEntry::init_flags_atomic(intx flags) { >>> ?? return Atomic::cmpxchg(flags, &_flags, (intx)0) == 0; >>> } >>> >>> Otherwise, I end up asking why result is intptr_t when the cmpxchg is >>> dealing with intx.? Yeah, one's a typedef of the other, but mixing >>> them like that in the same expression is not helpful. >>> >>> >> Sure why not? >> >> Actually init_flags_atomic is not used and neither is >> init_method_flags_atomic so I did one better and removed them. >> >> Thanks for the again thorough code review and Atomic::sub.?? I'll post >> incremental when it compiles. >> >> Coleen > From volker.simonis at gmail.com Mon Oct 16 23:07:39 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 16 Oct 2017 23:07:39 +0000 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: References: <50cda0ab-f403-372a-ce51-1a27d8821448@oracle.com> <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> <4109f960-078f-e582-3c78-71f201a265fd@redhat.com> Message-ID: Volker Simonis schrieb am Di. 10. Okt. 2017 um 19:17: > On Tue, Oct 10, 2017 at 9:42 AM, Andrew Haley wrote: > > On 09/10/17 20:24, Volker Simonis wrote: > >> Unfortunately we can't easily generate these stubs during > >> 'stubRoutines_init1()' because > >> 'generate_dirty_card_log_enqueue_if_necessary()' needs the byte map > >> base address which is only initialized in > >> 'CardTableModRefBS::initialize()' during 'univers_init()' which > >> happens after 'stubRoutines_init1()'. 
> > > > Yes you can, you can do something like we do for narrow_ptrs_base: > > > > if (Universe::is_fully_initialized()) { > > mov(rheapbase, Universe::narrow_ptrs_base()); > > } else { > > lea(rheapbase, > ExternalAddress((address)Universe::narrow_ptrs_base_addr())); > > ldr(rheapbase, Address(rheapbase)); > > } > > > Hi, can somebody please take a look at the new version of the patch? Thanks, Volker > Hi Andrew, > > thanks for your suggestion. Yes, I could do that, but that would > replace a constant load in the barrier with a constant load plus a > load from memory, because during stubRoutines_init1() heap won't be > initialized. Not sure about this, but I think we want to avoid this > overhead in the barriers. > > Also, Christian proposed in a previous mail to replace the G1 barrier > stubs on SPARC with simple runtime calls like on other platforms. > While I think that it is probably worthwhile thinking about such a > change, I don't know the exact history of these stubs and probably > some GC experts should decide if that's really a good idea. I'd be > happy to open an extra issue for following up on that path. > > But for the moments I've simply added a new initialization step > "g1_barrier_stubs_init()" between 'univers_init()' and > interpreter_init() which is empty on all platforms except SPARC where > it generates the corresponding stubs: > > http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v3/ > > I've built and smoke-tested the new change on Windows, MacOS, > Solaris/SPARC, AIX, Linux/x86_64/ppc64/ppc64le/s390. Unfortunately I > don't have access to ARM machines so I couldn't check arm,arm64 and > aarch64 although I don't expect any problems there (actually I've just > added an empty method there). But it would be great if somebody could > check that for any case. > > @Vladimir: I've also rebased the change for "8187091: > ReturnBlobToWrongHeapTest fails because of problems in > CodeHeap::contains_blob()": > > http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v2/ > > Because it changes the same files like 8166317 it should be applied > and pushed only after 8166317 was pushed. > > Thank you and best regards, > Volker > > > -- > > Andrew Haley > > Java Platform Lead Engineer > > Red Hat UK Ltd. > > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From kim.barrett at oracle.com Mon Oct 16 23:29:06 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 16 Oct 2017 19:29:06 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <5a787ec8-afe6-b8a6-23de-5d6a5b935035@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> <0784FA88-3D00-4DBA-8726-3A3B23C91B3E@oracle.com> <2f32124d-2428-678d-ef50-3306231aa848@oracle.com> <5a787ec8-afe6-b8a6-23de-5d6a5b935035@oracle.com> Message-ID: <8DB3C54F-EA41-4F08-A2DB-839A577A2A55@oracle.com> > On Oct 16, 2017, at 11:59 AM, coleen.phillimore at oracle.com wrote: > > > The latest incremental based on these comments (now running tier1). > http://cr.openjdk.java.net/~coleenp/8188220.review-comments.02/webrev/index.html > > plus what Roman sent in the "RFR: 8189333: Fix Zero build after Atomic::xchg changes" thread. Looks good. 
I?ll file an RFR for replace_if_null From coleen.phillimore at oracle.com Tue Oct 17 00:45:03 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 16 Oct 2017 20:45:03 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <2adcda24-1386-b5ee-81d3-2e4604b0f4d5@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> <0784FA88-3D00-4DBA-8726-3A3B23C91B3E@oracle.com> <2f32124d-2428-678d-ef50-3306231aa848@oracle.com> <5a787ec8-afe6-b8a6-23de-5d6a5b935035@oracle.com> <2adcda24-1386-b5ee-81d3-2e4604b0f4d5@oracle.com> Message-ID: <06cb2315-a5ae-5c3d-365b-c24bbf0e5bdb@oracle.com> Thanks David! Coleen On 10/16/17 5:58 PM, David Holmes wrote: > Seems okay. > > Thanks, > David > > On 17/10/2017 1:59 AM, coleen.phillimore at oracle.com wrote: >> >> The latest incremental based on these comments (now running tier1). >> http://cr.openjdk.java.net/~coleenp/8188220.review-comments.02/webrev/index.html >> >> >> plus what Roman sent in the "RFR: 8189333: Fix Zero build after >> Atomic::xchg changes" thread. >> >> thanks, >> Coleen >> >> On 10/16/17 9:13 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 10/14/17 7:36 PM, Kim Barrett wrote: >>>>> On Oct 13, 2017, at 2:34 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> Hi, Here is the version with the changes from Kim's comments that >>>>> has passed at least testing with JPRT and tier1, locally.?? More >>>>> testing (tier2-5) is in progress. >>>>> >>>>> Also includes a corrected version of Atomic::sub care of Erik >>>>> Osterlund. >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/8188220.kim-review-changes/webrev >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/8188220.review-comments/webrev >>>>> >>>>> Full version: >>>>> >>>>> http://cr.openjdk.java.net/~coleenp/8188220.03/webrev >>>>> >>>>> Thanks! >>>>> Coleen >>>> I still dislike and disagree with what is being proposed regarding >>>> replace_if_null. >>> >>> We can discuss that seperately, please file an RFE. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> I forgot that I'd promised you an updated Atomic::sub definition. >>>> Unfortunately, the new one still has problems, performing some >>>> conversions that should not be permitted (and are disallowed by >>>> Atomic::add).? Try this instead.? (This hasn't been tested, not even >>>> compiled; hopefully I don't have any typos or anything.) The intent >>>> is that this supports the same conversions as Atomic::add. >>>> >>>> template >>>> inline D Atomic::sub(I sub_value, D volatile* dest) { >>>> ?? STATIC_ASSERT(IsPointer::value || IsIntegral::value); >>>> ?? STATIC_ASSERT(IsIntegral::value); >>>> ?? // If D is a pointer type, use [u]intptr_t as the addend type, >>>> ?? // matching signedness of I.? Otherwise, use D as the addend type. >>>> ?? typedef typename Conditional::value, intptr_t, >>>> uintptr_t>::type PI; >>>> ?? typedef typename Conditional::value, PI, D>::type >>>> AddendType; >>>> ?? // Only allow conversions that can't change the value. >>>> ?? STATIC_ASSERT(IsSigned::value == IsSigned::value); >>>> ?? STATIC_ASSERT(sizeof(I) <= sizeof(AddendType)); >>>> ?? AddendType addend = sub_value; >>>> ?? // Assumes two's complement integer representation. >>>> ?? #pragma warning(suppress: 4146) // In case AddendType is not >>>> signed. >>>> ?? 
return Atomic::add(-addend, dest); >>>> } >>> >>> Uh, Ok.? I'll try it out. >>>> >>>>>>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp >>>>>>> 7960?? Atomic::add(-n, &_num_par_pushes); >>>>>>> >>>>>>> Atomic::sub >>>>>> fixed. >>>> Nope, not fixed in >>>> http://cr.openjdk.java.net/~coleenp/8188220.03/webrev >>> >>> Missed it twice now.? I think I have it now. >>>>>>> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >>>>>>> ?? 200?????? PerRegionTable* res = >>>>>>> ?? 201???????? Atomic::cmpxchg(nxt, &_free_list, fl); >>>>>>> >>>>>>> Please remove the line break, now that the code has been >>>>>>> simplified. >>>>>>> >>>>>>> But wait, doesn't this alloc exhibit classic ABA problems?? I >>>>>>> *think* >>>>>>> this works because alloc and bulk_free are called in different >>>>>>> phases, >>>>>>> never overlapping. >>>>>> I don't know.? Do you want to file a bug to investigate this? >>>>>> fixed. >>>> No, I now think it?s ok, though confusing. >>>> >>>>>>> src/hotspot/share/gc/g1/sparsePRT.cpp >>>>>>> ?? 295???? SparsePRT* res = >>>>>>> ?? 296?????? Atomic::cmpxchg(sprt, &_head_expanded_list, hd); >>>>>>> and >>>>>>> ?? 307???? SparsePRT* res = >>>>>>> ?? 308?????? Atomic::cmpxchg(next, &_head_expanded_list, hd); >>>>>>> >>>>>>> I'd rather not have the line breaks in these either. >>>>>>> >>>>>>> And get_from_expanded_list also appears to have classic ABA >>>>>>> problems. >>>>>>> I *think* this works because add_to_expanded_list and >>>>>>> get_from_expanded_list are called in different phases, never >>>>>>> overlapping. >>>>>> Fixed, same question as above?? Or one bug to investigate both? >>>> Again, I think it?s ok, though confusing. >>>> >>>>>>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>>>>>> ?? 262?? return (size_t) Atomic::cmpxchg((intptr_t)new_age._data, >>>>>>> ?? 263?????????????????????????????????? (volatile intptr_t >>>>>>> *)&_data, >>>>>>> ?? 264 (intptr_t)old_age._data); >>>>>>> >>>>>>> This should be >>>>>>> >>>>>>> ??? return Atomic::cmpxchg(new_age._data, &_data, old_age._data); >>>>>> fixed. >>>> Still casting the result. >>> >>> I thought I fixed it.? I think I fixed it now. >>>> >>>>>>> src/hotspot/share/oops/method.hpp >>>>>>> ?? 139?? volatile address from_compiled_entry() const?? { return >>>>>>> OrderAccess::load_acquire(&_from_compiled_entry); } >>>>>>> ?? 140?? volatile address from_compiled_entry_no_trampoline() >>>>>>> const; >>>>>>> ?? 141?? volatile address from_interpreted_entry() const{ return >>>>>>> OrderAccess::load_acquire(&_from_interpreted_entry); } >>>>>>> >>>>>>> [pre-existing] >>>>>>> The volatile qualifiers here seem suspect to me. >>>>>> Again much suspicion about concurrency and giant pain, which I >>>>>> remember, of debugging these when they were broken. >>>> Let me be more direct: the volatile qualifiers for the function return >>>> types are bogus and confusing, and should be removed. >>> >>> Okay, sure. >>> >>>> >>>>>>> src/hotspot/share/prims/jni.cpp >>>>>>> >>>>>>> [pre-existing] >>>>>>> >>>>>>> copy_jni_function_table should be using >>>>>>> Copy::disjoint_words_atomic. >>>>>> yuck. >>>> Of course, neither is entirely technically correct, since both are >>>> treating conversion of function pointers to void* as okay in shared >>>> code, e.g. violating some of the raison d'etre of >>>> CAST_{TO,FROM}_FN_PTR. 
>>>> For way more detail than you probably care about, see the discussion >>>> starting here: >>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-March/018578.html >>>> >>>> through (5 messages in total) >>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-March/018623.html >>>> >>>> >>>> Oh well. >>>> >>>>>>> src/hotspot/share/runtime/mutex.hpp >>>>>>> >>>>>>> [pre-existing] >>>>>>> >>>>>>> I think the Address member of the SplitWord union is unused. >>>>>>> Looking >>>>>>> at AcquireOrPush (and others), I'm wondering whether it *should* be >>>>>>> used there, or whether just using intptr_t casts and doing integral >>>>>>> arithmetic (as is presently being done) is easier and clearer. >>>>>>> >>>>>>> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp >>>>>>> rather than polluting the global namespace.? And technically, that >>>>>>> name is reserved word. >>>>>> I moved both this and _LBIT into the top of mutex.cpp since they >>>>>> are used there. >>>> Good. >>>> >>>>>> Cant define const intptr_t _LBIT =1; in a class in our version of >>>>>> C++. >>>> Sorry, please explain?? If you tried to move it into SplitWord, >>>> that doesn?t work; >>>> unions are not permitted to have static data members (I don?t >>>> off-hand know why, >>>> just that it?s explicitly forbidden). >>>> >>>> And you left the seemingly unused Address member in SplitWord. >>> >>> This is the compilation error I get: >>> >>> /scratch/cphillim/hg/10ptr2/open/src/hotspot/share/runtime/mutex.hpp:124:33: >>> error: non-static data member initializers only available with >>> -std=c++11 or -std=gnu++11 [-Werror] >>> ?? const intptr_t _NEW_LOCKBIT = 1; >>> >>> >>> I don't own this SplitWord code so do not want to remove the unused >>> Address member. >>> >>>> >>>>>>> src/hotspot/share/runtime/thread.cpp >>>>>>> 4707?? intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, >>>>>>> (intptr_t)0); >>>>>>> >>>>>>> This and other places suggest LOCKBIT should be defined as >>>>>>> intptr_t, >>>>>>> rather than as an enum value.? The MuxBits enum type is unused. >>>>>>> >>>>>>> And the cast of 0 is another case where implicit widening would >>>>>>> be nice. >>>>>> Making LOCKBIT a const intptr_t = 1 removes a lot of casts. >>>> Because of the new definition of LOCKBIT I noticed the immediately >>>> preceeding typedef for MutexT, which seems to be unused. >>> >>> Removed MutexT. >>>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/oops/cpCache.cpp >>>> ? 114 bool ConstantPoolCacheEntry::init_flags_atomic(intx flags) { >>>> ? 115?? intptr_t result = Atomic::cmpxchg(flags, &_flags, (intx)0); >>>> ? 116?? return (result == 0); >>>> ? 117 } >>>> >>>> [I missed this on earlier pass.] >>>> >>>> Should be >>>> >>>> bool ConstantPoolCacheEntry::init_flags_atomic(intx flags) { >>>> ?? return Atomic::cmpxchg(flags, &_flags, (intx)0) == 0; >>>> } >>>> >>>> Otherwise, I end up asking why result is intptr_t when the cmpxchg is >>>> dealing with intx.? Yeah, one's a typedef of the other, but mixing >>>> them like that in the same expression is not helpful. >>>> >>>> >>> Sure why not? >>> >>> Actually init_flags_atomic is not used and neither is >>> init_method_flags_atomic so I did one better and removed them. >>> >>> Thanks for the again thorough code review and Atomic::sub. I'll post >>> incremental when it compiles. 
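(Aside, for readers puzzling over the -std=c++11 diagnostic quoted above: a minimal sketch of the two placements being discussed. Only the name _LBIT comes from the thread; the struct and helper names are invented, and this is not code from the webrev.)

#include <stdint.h>

// (a) In-class initializer: needs C++11 non-static data member initializers,
//     which is exactly the diagnostic quoted above.
//
//     struct SplitWordish {
//       const intptr_t _LBIT = 1;   // rejected at the pre-C++11 language level
//     };

// (b) The placement that works with the older language level: a file-scope
//     constant at the top of mutex.cpp, visible only to the code that uses it.
static const intptr_t _LBIT = 1;

// Invented helper, only to show the constant applied to a lock word.
static inline bool lock_bit_is_set(intptr_t word) {
  return (word & _LBIT) != 0;
}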
>>> >>> Coleen >> From coleen.phillimore at oracle.com Tue Oct 17 00:46:07 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 16 Oct 2017 20:46:07 -0400 Subject: RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot In-Reply-To: <8DB3C54F-EA41-4F08-A2DB-839A577A2A55@oracle.com> References: <7A475565-84D9-4F98-AE7B-2FDB206CC6E1@oracle.com> <49b7c5f7-2f6d-16ac-0b60-140619d0fffd@oracle.com> <0784FA88-3D00-4DBA-8726-3A3B23C91B3E@oracle.com> <2f32124d-2428-678d-ef50-3306231aa848@oracle.com> <5a787ec8-afe6-b8a6-23de-5d6a5b935035@oracle.com> <8DB3C54F-EA41-4F08-A2DB-839A577A2A55@oracle.com> Message-ID: Thanks, Kim. Coleen On 10/16/17 7:29 PM, Kim Barrett wrote: >> On Oct 16, 2017, at 11:59 AM, coleen.phillimore at oracle.com wrote: >> >> >> The latest incremental based on these comments (now running tier1). >> http://cr.openjdk.java.net/~coleenp/8188220.review-comments.02/webrev/index.html >> >> plus what Roman sent in the "RFR: 8189333: Fix Zero build after Atomic::xchg changes" thread. > Looks good. > > I?ll file an RFR for replace_if_null > From nils.eliasson at oracle.com Tue Oct 17 14:37:09 2017 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 17 Oct 2017 16:37:09 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: Hi Robbin, I have reviewed the compiler parts of the patch - c1, c2, jvmci and cpu*. Look great! Regards, Nils On 2017-10-11 15:37, Robbin Ehn wrote: > Hi all, > > Starting the review of the code while JEP work is still not completed. > > JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 > > This JEP introduces a way to execute a callback on threads without > performing a global VM safepoint. It makes it both possible and cheap > to stop individual threads and not just all threads or none. > > Entire changeset: > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ > > Divided into 3-parts, > SafepointMechanism abstraction: > http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ > Consolidating polling page allocation: > http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ > Handshakes: > http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ > > A handshake operation is a callback that is executed for each > JavaThread while that thread is in a safepoint safe state. The > callback is executed either by the thread itself or by the VM thread > while keeping the thread in a blocked state. The big difference > between safepointing and handshaking is that the per thread operation > will be performed on all threads as soon as possible and they will > continue to execute as soon as it?s own operation is completed. If a > JavaThread is known to be running, then a handshake can be performed > with that single JavaThread as well. > > The current safepointing scheme is modified to perform an indirection > through a per-thread pointer which will allow a single thread's > execution to be forced to trap on the guard page. In order to force a > thread to yield the VM updates the per-thread pointer for the > corresponding thread to point to the guarded page. 
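(To make the indirection just described concrete, here is a small conceptual sketch. All names are invented and the guard page is modelled by an ordinary variable; in the VM it is a page that has been protected so that the load faults.)

#include <stdint.h>

static uintptr_t readable_word;   // stands in for the always-readable polling page
static uintptr_t guard_word;      // stands in for the protected page the VM really uses

struct JavaThreadish {
  volatile uintptr_t* _polling_word;   // per-thread pointer read by every poll
};

// The VM points one thread at the guard word; in the real scheme only that
// thread's next poll will then fault.
void arm(JavaThreadish* t)    { t->_polling_word = &guard_word; }
void disarm(JavaThreadish* t) { t->_polling_word = &readable_word; }

// What the emitted poll conceptually does: one extra load (the per-thread
// indirection) followed by the usual read of the polling page.
void poll(JavaThreadish* t) {
  volatile uintptr_t* page = t->_polling_word;
  (void)*page;   // in the VM this read faults when armed and enters the safepoint/handshake path
}

(That extra load through the per-thread pointer is the "load vs load load" cost mentioned in the performance numbers further down in the quoted text.)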
> > Example of potential use-cases: > -Biased lock revocation > -External requests for stack traces > -Deoptimization > -Async exception delivery > -External suspension > -Eliding memory barriers > > All of these will benefit the VM moving towards becoming more > low-latency friendly by reducing the number of global safepoints. > Platforms that do not yet implement the per JavaThread poll, a > fallback to normal safepoint is in place. HandshakeOneThread will then > be a normal safepoint. The supported platforms are Linux x64 and > Solaris SPARC. > > Tested heavily with various test suits and comes with a few new tests. > > Performance testing using standardized benchmark show no signification > changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris > SPARC (not statistically ensured). A minor regression for the load vs > load load on x64 is expected and a slight increase on SPARC due to the > cost of ?materializing? the page vs load load. > The time to trigger a safepoint was measured on a large machine to not > be an issue. The looping over threads and arming the polling page will > benefit from the work on JavaThread life-cycle (8167108 - SMR and > JavaThread Lifecycle: > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) > which puts all JavaThreads in an array instead of a linked list. > > Thanks, Robbin From erik.osterlund at oracle.com Tue Oct 17 15:30:30 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 17 Oct 2017 17:30:30 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: <59E62216.5070401@oracle.com> Hi Robbin, Looks fantastic. Thanks, /Erik On 2017-10-11 15:37, Robbin Ehn wrote: > Hi all, > > Starting the review of the code while JEP work is still not completed. > > JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 > > This JEP introduces a way to execute a callback on threads without > performing a global VM safepoint. It makes it both possible and cheap > to stop individual threads and not just all threads or none. > > Entire changeset: > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ > > Divided into 3-parts, > SafepointMechanism abstraction: > http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ > Consolidating polling page allocation: > http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ > Handshakes: > http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ > > A handshake operation is a callback that is executed for each > JavaThread while that thread is in a safepoint safe state. The > callback is executed either by the thread itself or by the VM thread > while keeping the thread in a blocked state. The big difference > between safepointing and handshaking is that the per thread operation > will be performed on all threads as soon as possible and they will > continue to execute as soon as it?s own operation is completed. If a > JavaThread is known to be running, then a handshake can be performed > with that single JavaThread as well. > > The current safepointing scheme is modified to perform an indirection > through a per-thread pointer which will allow a single thread's > execution to be forced to trap on the guard page. In order to force a > thread to yield the VM updates the per-thread pointer for the > corresponding thread to point to the guarded page. 
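(As a rough illustration of the global-safepoint versus handshake distinction drawn above; this is a toy model and none of these names exist in the webrev.)

#include <vector>
#include <cstddef>

struct ToyThread { bool stopped; };

typedef void (*ToyOp)(ToyThread*);

// Global safepoint: every thread stays stopped until the operation has run
// on all of them, and they all resume together.
void at_toy_safepoint(std::vector<ToyThread*>& threads, ToyOp op) {
  for (std::size_t i = 0; i < threads.size(); i++) threads[i]->stopped = true;   // stop the world
  for (std::size_t i = 0; i < threads.size(); i++) op(threads[i]);               // run the operation everywhere
  for (std::size_t i = 0; i < threads.size(); i++) threads[i]->stopped = false;  // everyone resumes together
}

// Handshake: a single known-running thread can be targeted on its own, and it
// resumes as soon as its own callback is done.
void toy_handshake(ToyThread* t, ToyOp op) {
  t->stopped = true;    // target reaches a safepoint-safe state
  op(t);                // run by the thread itself or by the VM thread on its behalf
  t->stopped = false;   // resumes at once; no other thread was ever stopped
}

(The point of the second function is that only the targeted thread pays for the stop, which is what makes the per-thread use-cases listed above cheap.)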
> > Example of potential use-cases: > -Biased lock revocation > -External requests for stack traces > -Deoptimization > -Async exception delivery > -External suspension > -Eliding memory barriers > > All of these will benefit the VM moving towards becoming more > low-latency friendly by reducing the number of global safepoints. > Platforms that do not yet implement the per JavaThread poll, a > fallback to normal safepoint is in place. HandshakeOneThread will then > be a normal safepoint. The supported platforms are Linux x64 and > Solaris SPARC. > > Tested heavily with various test suits and comes with a few new tests. > > Performance testing using standardized benchmark show no signification > changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris > SPARC (not statistically ensured). A minor regression for the load vs > load load on x64 is expected and a slight increase on SPARC due to the > cost of ?materializing? the page vs load load. > The time to trigger a safepoint was measured on a large machine to not > be an issue. The looping over threads and arming the polling page will > benefit from the work on JavaThread life-cycle (8167108 - SMR and > JavaThread Lifecycle: > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) > which puts all JavaThreads in an array instead of a linked list. > > Thanks, Robbin From vladimir.kozlov at oracle.com Tue Oct 17 17:49:10 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 17 Oct 2017 10:49:10 -0700 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: References: <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> <4109f960-078f-e582-3c78-71f201a265fd@redhat.com> Message-ID: <1c2eeaa1-334a-4744-ba31-87e580faafa5@oracle.com> Hi, Volker You can do a trick with NOT_SPARC() macro to avoid defining empty method on all platforms: +#if INCLUDE_ALL_GCS +void g1_barrier_stubs_init() NOT_SPARC( {} ); // depends on universe_init, must be before interpreter_init +#endif I thought we pushed 8187091 already. I will keep it in mind. Thanks, Vladimir On 10/10/17 10:17 AM, Volker Simonis wrote: > On Tue, Oct 10, 2017 at 9:42 AM, Andrew Haley wrote: >> On 09/10/17 20:24, Volker Simonis wrote: >>> Unfortunately we can't easily generate these stubs during >>> 'stubRoutines_init1()' because >>> 'generate_dirty_card_log_enqueue_if_necessary()' needs the byte map >>> base address which is only initialized in >>> 'CardTableModRefBS::initialize()' during 'univers_init()' which >>> happens after 'stubRoutines_init1()'. >> >> Yes you can, you can do something like we do for narrow_ptrs_base: >> >> if (Universe::is_fully_initialized()) { >> mov(rheapbase, Universe::narrow_ptrs_base()); >> } else { >> lea(rheapbase, ExternalAddress((address)Universe::narrow_ptrs_base_addr())); >> ldr(rheapbase, Address(rheapbase)); >> } >> > > Hi Andrew, > > thanks for your suggestion. Yes, I could do that, but that would > replace a constant load in the barrier with a constant load plus a > load from memory, because during stubRoutines_init1() heap won't be > initialized. Not sure about this, but I think we want to avoid this > overhead in the barriers. > > Also, Christian proposed in a previous mail to replace the G1 barrier > stubs on SPARC with simple runtime calls like on other platforms. 
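(For readers unfamiliar with the NOT_SPARC() trick suggested above, a simplified sketch of how those convenience macros behave; the real definitions live in HotSpot's utilities/macros.hpp and key off the CPU define set by the build.)

#ifdef SPARC
#define SPARC_ONLY(code) code
#define NOT_SPARC(code)
#else
#define SPARC_ONLY(code)
#define NOT_SPARC(code) code
#endif

// The single shared line
//   void g1_barrier_stubs_init() NOT_SPARC( {} );
// expands to an empty inline definition on every platform except SPARC,
// where it is only a declaration and the stub-generating definition is
// supplied by the SPARC-specific sources.
void g1_barrier_stubs_init() NOT_SPARC( {} );

SPARC_ONLY(void g1_barrier_stubs_init() { /* generate the G1 barrier stubs here */ })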
> While I think that it is probably worthwhile thinking about such a > change, I don't know the exact history of these stubs and probably > some GC experts should decide if that's really a good idea. I'd be > happy to open an extra issue for following up on that path. > > But for the moments I've simply added a new initialization step > "g1_barrier_stubs_init()" between 'univers_init()' and > interpreter_init() which is empty on all platforms except SPARC where > it generates the corresponding stubs: > > http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v3/ > > I've built and smoke-tested the new change on Windows, MacOS, > Solaris/SPARC, AIX, Linux/x86_64/ppc64/ppc64le/s390. Unfortunately I > don't have access to ARM machines so I couldn't check arm,arm64 and > aarch64 although I don't expect any problems there (actually I've just > added an empty method there). But it would be great if somebody could > check that for any case. > > @Vladimir: I've also rebased the change for "8187091: > ReturnBlobToWrongHeapTest fails because of problems in > CodeHeap::contains_blob()": > > http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v2/ > > Because it changes the same files like 8166317 it should be applied > and pushed only after 8166317 was pushed. > > Thank you and best regards, > Volker > >> -- >> Andrew Haley >> Java Platform Lead Engineer >> Red Hat UK Ltd. >> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Tue Oct 17 17:58:47 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 17 Oct 2017 17:58:47 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: <15dd917732444959b7785efbe6640952@sap.com> Hi Robbin, my first impression is very good. Thanks for providing the webrev. I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. Would it be ok to move the decision between what to use to platform code? (Some platforms could still use both if this is beneficial.) E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. Best regards, Martin -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn Sent: Mittwoch, 11. Oktober 2017 15:38 To: hotspot-dev developers Subject: RFR(XL): 8185640: Thread-local handshakes Hi all, Starting the review of the code while JEP work is still not completed. JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not just all threads or none. Entire changeset: http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ Divided into 3-parts, SafepointMechanism abstraction: http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ Consolidating polling page allocation: http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ Handshakes: http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. 
The callback is executed either by the thread itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a handshake can be performed with that single JavaThread as well. The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. Example of potential use-cases: -Biased lock revocation -External requests for stack traces -Deoptimization -Async exception delivery -External suspension -Eliding memory barriers All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported platforms are Linux x64 and Solaris SPARC. Tested heavily with various test suits and comes with a few new tests. Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all JavaThreads in an array instead of a linked list. Thanks, Robbin From coleen.phillimore at oracle.com Tue Oct 17 18:18:42 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 17 Oct 2017 14:18:42 -0400 Subject: Result: New hotspot Group Member: Ioi Lam Message-ID: <7331f8aa-6396-6a62-069e-b13ebc12c8d3@oracle.com> The vote for Ioi Lam [1] is now closed. Yes: 10 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. Coleen Phillimore [1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-October/028480.html From vladimir.kozlov at oracle.com Tue Oct 17 18:30:22 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 17 Oct 2017 11:30:22 -0700 Subject: RFR: Newer AMD 17h (EPYC) Processor family defaults In-Reply-To: <886a112d-fc55-34d5-6e70-1e6a78cf1b0f@oracle.com> References: <4d4fe028-ea6a-4f77-ab69-5c2bc752e1f5@oracle.com> <47bc0a90-ed6a-220a-c3d1-b4df2d8bbc74@oracle.com> <9c53f889-e58e-33ac-3c05-874779b469d6@oracle.com> <45619e1a-9eb0-a540-193b-5187da3bf6bc@oracle.com> <66e4af43-c0e2-6d64-b69f-35166150ffa2@oracle.com> <11af0f62-ba6b-d533-d23c-750d2ca012c7@oracle.com> <886a112d-fc55-34d5-6e70-1e6a78cf1b0f@oracle.com> Message-ID: Nils, I would like to review you changes as separate bug in separate thread. I don't like your current changes and want to discuss them. Please, send separate RFR. 
Thanks, Vladimir On 10/16/17 7:26 AM, Nils Eliasson wrote: > Hi, > > I ran into a problem touching this area, so I'm hijacking this thread. > > > #ifdef COMPILER2 > > - if (MaxVectorSize > 16) { > > - // Limit vectors size to 16 bytes on current AMD cpus. > >> +??? if (cpu_family() < 0x17 && MaxVectorSize > 16) { >> +????? // Limit vectors size to 16 bytes on AMD cpus < 17h. >> ?????? FLAG_SET_DEFAULT(MaxVectorSize, 16); >> ???? } >> ?#endif // COMPILER2 > > > The limitation of MaxVecorSize to 16 for some processors in this code > has the sideeffect that the TypeVect::VECTY and mreg2type[Op_VecY] won't > be initalized even though the platform has the capability. > > Type.cpp:~660 > > [...] > >?? if (Matcher::vector_size_supported(T_FLOAT,4)) { > >???? TypeVect::VECTX = TypeVect::make(T_FLOAT,4); > >?? } > >?? if (Matcher::vector_size_supported(T_FLOAT,8)) { > >???? TypeVect::VECTY = TypeVect::make(T_FLOAT,8); > >?? } > >?? if (Matcher::vector_size_supported(T_FLOAT,16)) { > >???? TypeVect::VECTZ = TypeVect::make(T_FLOAT,16); > >?? } > [...] > >?? mreg2type[Op_VecX] = TypeVect::VECTX; > >?? mreg2type[Op_VecY] = TypeVect::VECTY; > >?? mreg2type[Op_VecZ] = TypeVect::VECTZ; > > In the ad-files feature flags (UseAVX etc.) are used to control what > rules should be matched if it has effects on specific vector registers. > Here we have a mismatch. > > On a platform that supports AVX2 but have MaxVectorSize limited to 16, > the VM will fail in regalloc when the TypeVect::VECTY/mreg2type[Op_VecY] > is uninitalized, we will also hit asserts in a few places like: > assert(Matcher::vector_size_supported(T_FLOAT,RegMask::SlotsPerVecY), > "sanity"); > > Shouldn't the type initalization in type.cpp be dependent on feature > flag (UseAVX etc.) instead of MaxVectorLength? (The type for the vector > registers are initalized if the platform supports them, but they might > not be used if MaxVectorSize is limited.) > > I suggest something like this: > > http://cr.openjdk.java.net/~neliasso/maxvectorsize/webrev/ > > I will open a bug and and a separate RFR if this seems reasonable to you. > > Regards, > Nils Eliasson > > On 2017-09-22 09:41, Rohit Arul Raj wrote: >> Thanks Vladimir, >> >> On Wed, Sep 20, 2017 at 10:07 PM, Vladimir Kozlov >> wrote: >>>> ?????? __ cmpl(rax, 0x80000000);???? // Is cpuid(0x80000001) supported? >>>> ?????? __ jcc(Assembler::belowEqual, done); >>>> ?????? __ cmpl(rax, 0x80000004);???? // Is cpuid(0x80000005) supported? >>>> -??? __ jccb(Assembler::belowEqual, ext_cpuid1); >>>> +?? __ jcc(Assembler::belowEqual, ext_cpuid1); >>> >>> Good. You may need to increase size of the buffer too (to be safe) to >>> 1100: >>> >>> static const int stub_size = 1000; >>> >> Please find the updated patch after the requested change. >> >> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >> b/src/cpu/x86/vm/vm_version_x86.cpp >> --- a/src/cpu/x86/vm/vm_version_x86.cpp >> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >> @@ -46,7 +46,7 @@ >> ? address VM_Version::_cpuinfo_cont_addr = 0; >> >> ? static BufferBlob* stub_blob; >> -static const int stub_size = 1000; >> +static const int stub_size = 1100; >> >> ? extern "C" { >> ??? typedef void (*get_cpu_info_stub_t)(void*); >> @@ -70,7 +70,7 @@ >> ????? bool use_evex = FLAG_IS_DEFAULT(UseAVX) || (UseAVX > 2); >> >> ????? Label detect_486, cpu486, detect_586, std_cpuid1, std_cpuid4; >> -??? Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >> done, wrapup; >> +??? 
Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >> ext_cpuid8, done, wrapup; >> ????? Label legacy_setup, save_restore_except, legacy_save_restore, >> start_simd_check; >> >> ????? StubCodeMark mark(this, "VM_Version", "get_cpu_info_stub"); >> @@ -267,14 +267,30 @@ >> ????? __ cmpl(rax, 0x80000000);???? // Is cpuid(0x80000001) supported? >> ????? __ jcc(Assembler::belowEqual, done); >> ????? __ cmpl(rax, 0x80000004);???? // Is cpuid(0x80000005) supported? >> -??? __ jccb(Assembler::belowEqual, ext_cpuid1); >> +??? __ jcc(Assembler::belowEqual, ext_cpuid1); >> ????? __ cmpl(rax, 0x80000006);???? // Is cpuid(0x80000007) supported? >> ????? __ jccb(Assembler::belowEqual, ext_cpuid5); >> ????? __ cmpl(rax, 0x80000007);???? // Is cpuid(0x80000008) supported? >> ????? __ jccb(Assembler::belowEqual, ext_cpuid7); >> +??? __ cmpl(rax, 0x80000008);???? // Is cpuid(0x80000009 and above) >> supported? >> +??? __ jccb(Assembler::belowEqual, ext_cpuid8); >> +??? __ cmpl(rax, 0x8000001E);???? // Is cpuid(0x8000001E) supported? >> +??? __ jccb(Assembler::below, ext_cpuid8); >> +??? // >> +??? // Extended cpuid(0x8000001E) >> +??? // >> +??? __ movl(rax, 0x8000001E); >> +??? __ cpuid(); >> +??? __ lea(rsi, Address(rbp, >> in_bytes(VM_Version::ext_cpuid1E_offset()))); >> +??? __ movl(Address(rsi, 0), rax); >> +??? __ movl(Address(rsi, 4), rbx); >> +??? __ movl(Address(rsi, 8), rcx); >> +??? __ movl(Address(rsi,12), rdx); >> + >> ????? // >> ????? // Extended cpuid(0x80000008) >> ????? // >> +??? __ bind(ext_cpuid8); >> ????? __ movl(rax, 0x80000008); >> ????? __ cpuid(); >> ????? __ lea(rsi, Address(rbp, >> in_bytes(VM_Version::ext_cpuid8_offset()))); >> @@ -1109,11 +1125,27 @@ >> ????? } >> >> ? #ifdef COMPILER2 >> -??? if (MaxVectorSize > 16) { >> -????? // Limit vectors size to 16 bytes on current AMD cpus. >> +??? if (cpu_family() < 0x17 && MaxVectorSize > 16) { >> +????? // Limit vectors size to 16 bytes on AMD cpus < 17h. >> ??????? FLAG_SET_DEFAULT(MaxVectorSize, 16); >> ????? } >> ? #endif // COMPILER2 >> + >> +??? // Some defaults for AMD family 17h >> +??? if ( cpu_family() == 0x17 ) { >> +????? // On family 17h processors use XMM and UnalignedLoadStores for >> Array Copy >> +????? if (supports_sse2() && FLAG_IS_DEFAULT(UseXMMForArrayCopy)) { >> +??????? FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >> +????? } >> +????? if (supports_sse2() && FLAG_IS_DEFAULT(UseUnalignedLoadStores)) { >> +??????? FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >> +????? } >> +#ifdef COMPILER2 >> +????? if (supports_sse4_2() && FLAG_IS_DEFAULT(UseFPUForSpilling)) { >> +??????? FLAG_SET_DEFAULT(UseFPUForSpilling, true); >> +????? } >> +#endif >> +??? } >> ??? } >> >> ??? if( is_intel() ) { // Intel cpus specific settings >> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >> b/src/cpu/x86/vm/vm_version_x86.hpp >> --- a/src/cpu/x86/vm/vm_version_x86.hpp >> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >> @@ -228,6 +228,15 @@ >> ????? } bits; >> ??? }; >> >> +? union ExtCpuid1EEbx { >> +??? uint32_t value; >> +??? struct { >> +????? uint32_t????????????????? : 8, >> +?????????????? threads_per_core : 8, >> +??????????????????????????????? : 16; >> +??? } bits; >> +? }; >> + >> ??? union XemXcr0Eax { >> ????? uint32_t value; >> ????? struct { >> @@ -398,6 +407,12 @@ >> ????? ExtCpuid8Ecx ext_cpuid8_ecx; >> ????? uint32_t???? ext_cpuid8_edx; // reserved >> >> +??? // cpuid function 0x8000001E // AMD 17h >> +??? uint32_t????? ext_cpuid1E_eax; >> +??? 
ExtCpuid1EEbx ext_cpuid1E_ebx; // threads per core (AMD17h) >> +??? uint32_t????? ext_cpuid1E_ecx; >> +??? uint32_t????? ext_cpuid1E_edx; // unused currently >> + >> ????? // extended control register XCR0 (the XFEATURE_ENABLED_MASK >> register) >> ????? XemXcr0Eax?? xem_xcr0_eax; >> ????? uint32_t???? xem_xcr0_edx; // reserved >> @@ -505,6 +520,14 @@ >> ??????? result |= CPU_CLMUL; >> ????? if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >> ??????? result |= CPU_RTM; >> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >> +?????? result |= CPU_ADX; >> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >> +????? result |= CPU_BMI2; >> +??? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >> +????? result |= CPU_SHA; >> +??? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >> +????? result |= CPU_FMA; >> >> ????? // AMD features. >> ????? if (is_amd()) { >> @@ -518,16 +541,8 @@ >> ????? } >> ????? // Intel features. >> ????? if(is_intel()) { >> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >> -???????? result |= CPU_ADX; >> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >> -??????? result |= CPU_BMI2; >> -????? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >> -??????? result |= CPU_SHA; >> ??????? if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel != 0) >> ????????? result |= CPU_LZCNT; >> -????? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >> -??????? result |= CPU_FMA; >> ??????? // for Intel, ecx.bits.misalignsse bit (bit 8) indicates >> support for prefetchw >> ??????? if (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse != 0) { >> ????????? result |= CPU_3DNOW_PREFETCH; >> @@ -590,6 +605,7 @@ >> ??? static ByteSize ext_cpuid5_offset() { return >> byte_offset_of(CpuidInfo, ext_cpuid5_eax); } >> ??? static ByteSize ext_cpuid7_offset() { return >> byte_offset_of(CpuidInfo, ext_cpuid7_eax); } >> ??? static ByteSize ext_cpuid8_offset() { return >> byte_offset_of(CpuidInfo, ext_cpuid8_eax); } >> +? static ByteSize ext_cpuid1E_offset() { return >> byte_offset_of(CpuidInfo, ext_cpuid1E_eax); } >> ??? static ByteSize tpl_cpuidB0_offset() { return >> byte_offset_of(CpuidInfo, tpl_cpuidB0_eax); } >> ??? static ByteSize tpl_cpuidB1_offset() { return >> byte_offset_of(CpuidInfo, tpl_cpuidB1_eax); } >> ??? static ByteSize tpl_cpuidB2_offset() { return >> byte_offset_of(CpuidInfo, tpl_cpuidB2_eax); } >> @@ -673,8 +689,12 @@ >> ????? if (is_intel() && supports_processor_topology()) { >> ??????? result = _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus; >> ????? } else if (_cpuid_info.std_cpuid1_edx.bits.ht != 0) { >> -????? result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >> -?????????????? cores_per_cpu(); >> +????? if (cpu_family() >= 0x17) { >> +??????? result = _cpuid_info.ext_cpuid1E_ebx.bits.threads_per_core + 1; >> +????? } else { >> +??????? result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >> +???????????????? cores_per_cpu(); >> +????? } >> ????? } >> ????? return (result == 0 ? 1 : result); >> ??? } >> >> Regards, >> Rohit >> >>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>> @@ -70,7 +70,7 @@ >>>> ?????? bool use_evex = FLAG_IS_DEFAULT(UseAVX) || (UseAVX > 2); >>>> >>>> ?????? Label detect_486, cpu486, detect_586, std_cpuid1, std_cpuid4; >>>> -??? Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >>>> done, wrapup; >>>> +??? Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >>>> ext_cpuid8, done, wrapup; >>>> ?????? 
Label legacy_setup, save_restore_except, legacy_save_restore, >>>> start_simd_check; >>>> >>>> ?????? StubCodeMark mark(this, "VM_Version", "get_cpu_info_stub"); >>>> @@ -267,14 +267,30 @@ >>>> ?????? __ cmpl(rax, 0x80000000);???? // Is cpuid(0x80000001) supported? >>>> ?????? __ jcc(Assembler::belowEqual, done); >>>> ?????? __ cmpl(rax, 0x80000004);???? // Is cpuid(0x80000005) supported? >>>> -??? __ jccb(Assembler::belowEqual, ext_cpuid1); >>>> +??? __ jcc(Assembler::belowEqual, ext_cpuid1); >>>> ?????? __ cmpl(rax, 0x80000006);???? // Is cpuid(0x80000007) supported? >>>> ?????? __ jccb(Assembler::belowEqual, ext_cpuid5); >>>> ?????? __ cmpl(rax, 0x80000007);???? // Is cpuid(0x80000008) supported? >>>> ?????? __ jccb(Assembler::belowEqual, ext_cpuid7); >>>> +??? __ cmpl(rax, 0x80000008);???? // Is cpuid(0x80000009 and above) >>>> supported? >>>> +??? __ jccb(Assembler::belowEqual, ext_cpuid8); >>>> +??? __ cmpl(rax, 0x8000001E);???? // Is cpuid(0x8000001E) supported? >>>> +??? __ jccb(Assembler::below, ext_cpuid8); >>>> +??? // >>>> +??? // Extended cpuid(0x8000001E) >>>> +??? // >>>> +??? __ movl(rax, 0x8000001E); >>>> +??? __ cpuid(); >>>> +??? __ lea(rsi, Address(rbp, >>>> in_bytes(VM_Version::ext_cpuid1E_offset()))); >>>> +??? __ movl(Address(rsi, 0), rax); >>>> +??? __ movl(Address(rsi, 4), rbx); >>>> +??? __ movl(Address(rsi, 8), rcx); >>>> +??? __ movl(Address(rsi,12), rdx); >>>> + >>>> ?????? // >>>> ?????? // Extended cpuid(0x80000008) >>>> ?????? // >>>> +??? __ bind(ext_cpuid8); >>>> ?????? __ movl(rax, 0x80000008); >>>> ?????? __ cpuid(); >>>> ?????? __ lea(rsi, Address(rbp, >>>> in_bytes(VM_Version::ext_cpuid8_offset()))); >>>> @@ -1109,11 +1125,27 @@ >>>> ?????? } >>>> >>>> ?? #ifdef COMPILER2 >>>> -??? if (MaxVectorSize > 16) { >>>> -????? // Limit vectors size to 16 bytes on current AMD cpus. >>>> +??? if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>> +????? // Limit vectors size to 16 bytes on AMD cpus < 17h. >>>> ???????? FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>> ?????? } >>>> ?? #endif // COMPILER2 >>>> + >>>> +??? // Some defaults for AMD family 17h >>>> +??? if ( cpu_family() == 0x17 ) { >>>> +????? // On family 17h processors use XMM and UnalignedLoadStores for >>>> Array Copy >>>> +????? if (supports_sse2() && FLAG_IS_DEFAULT(UseXMMForArrayCopy)) { >>>> +??????? FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >>>> +????? } >>>> +????? if (supports_sse2() && >>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) { >>>> +??????? FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >>>> +????? } >>>> +#ifdef COMPILER2 >>>> +????? if (supports_sse4_2() && FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>> +??????? FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>> +????? } >>>> +#endif >>>> +??? } >>>> ???? } >>>> >>>> ???? if( is_intel() ) { // Intel cpus specific settings >>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>> @@ -228,6 +228,15 @@ >>>> ?????? } bits; >>>> ???? }; >>>> >>>> +? union ExtCpuid1EEbx { >>>> +??? uint32_t value; >>>> +??? struct { >>>> +????? uint32_t????????????????? : 8, >>>> +?????????????? threads_per_core : 8, >>>> +??????????????????????????????? : 16; >>>> +??? } bits; >>>> +? }; >>>> + >>>> ???? union XemXcr0Eax { >>>> ?????? uint32_t value; >>>> ?????? struct { >>>> @@ -398,6 +407,12 @@ >>>> ?????? ExtCpuid8Ecx ext_cpuid8_ecx; >>>> ?????? uint32_t???? ext_cpuid8_edx; // reserved >>>> >>>> +??? 
// cpuid function 0x8000001E // AMD 17h >>>> +??? uint32_t????? ext_cpuid1E_eax; >>>> +??? ExtCpuid1EEbx ext_cpuid1E_ebx; // threads per core (AMD17h) >>>> +??? uint32_t????? ext_cpuid1E_ecx; >>>> +??? uint32_t????? ext_cpuid1E_edx; // unused currently >>>> + >>>> ?????? // extended control register XCR0 (the XFEATURE_ENABLED_MASK >>>> register) >>>> ?????? XemXcr0Eax?? xem_xcr0_eax; >>>> ?????? uint32_t???? xem_xcr0_edx; // reserved >>>> @@ -505,6 +520,14 @@ >>>> ???????? result |= CPU_CLMUL; >>>> ?????? if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>>> ???????? result |= CPU_RTM; >>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>> +?????? result |= CPU_ADX; >>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>> +????? result |= CPU_BMI2; >>>> +??? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>> +????? result |= CPU_SHA; >>>> +??? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>> +????? result |= CPU_FMA; >>>> >>>> ?????? // AMD features. >>>> ?????? if (is_amd()) { >>>> @@ -518,16 +541,8 @@ >>>> ?????? } >>>> ?????? // Intel features. >>>> ?????? if(is_intel()) { >>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>> -???????? result |= CPU_ADX; >>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>> -??????? result |= CPU_BMI2; >>>> -????? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>> -??????? result |= CPU_SHA; >>>> ???????? if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel != 0) >>>> ?????????? result |= CPU_LZCNT; >>>> -????? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>> -??????? result |= CPU_FMA; >>>> ???????? // for Intel, ecx.bits.misalignsse bit (bit 8) indicates >>>> support for prefetchw >>>> ???????? if (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse != 0) { >>>> ?????????? result |= CPU_3DNOW_PREFETCH; >>>> @@ -590,6 +605,7 @@ >>>> ???? static ByteSize ext_cpuid5_offset() { return >>>> byte_offset_of(CpuidInfo, ext_cpuid5_eax); } >>>> ???? static ByteSize ext_cpuid7_offset() { return >>>> byte_offset_of(CpuidInfo, ext_cpuid7_eax); } >>>> ???? static ByteSize ext_cpuid8_offset() { return >>>> byte_offset_of(CpuidInfo, ext_cpuid8_eax); } >>>> +? static ByteSize ext_cpuid1E_offset() { return >>>> byte_offset_of(CpuidInfo, ext_cpuid1E_eax); } >>>> ???? static ByteSize tpl_cpuidB0_offset() { return >>>> byte_offset_of(CpuidInfo, tpl_cpuidB0_eax); } >>>> ???? static ByteSize tpl_cpuidB1_offset() { return >>>> byte_offset_of(CpuidInfo, tpl_cpuidB1_eax); } >>>> ???? static ByteSize tpl_cpuidB2_offset() { return >>>> byte_offset_of(CpuidInfo, tpl_cpuidB2_eax); } >>>> @@ -673,8 +689,12 @@ >>>> ?????? if (is_intel() && supports_processor_topology()) { >>>> ???????? result = _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus; >>>> ?????? } else if (_cpuid_info.std_cpuid1_edx.bits.ht != 0) { >>>> -????? result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>>> -?????????????? cores_per_cpu(); >>>> +????? if (cpu_family() >= 0x17) { >>>> +??????? result = _cpuid_info.ext_cpuid1E_ebx.bits.threads_per_core >>>> + 1; >>>> +????? } else { >>>> +??????? result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>>> +???????????????? cores_per_cpu(); >>>> +????? } >>>> ?????? } >>>> ?????? return (result == 0 ? 1 : result); >>>> ???? } >>>> >>>> Please let me know your comments. >>>> Thanks for your review. >>>> >>>> Regards, >>>> Rohit >>>> >>>>> >>>>> On 9/11/17 9:52 PM, Rohit Arul Raj wrote: >>>>>> >>>>>> Hello David, >>>>>> >>>>>>>> >>>>>>>> 1. ExtCpuid1EEx >>>>>>>> >>>>>>>> Should this be ExtCpuid1EEbx? 
(I see the naming here is somewhat >>>>>>>> inconsistent - and potentially confusing: I would have preferred to >>>>>>>> see >>>>>>>> things like ExtCpuid_1E_Ebx, to make it clear.) >>>>>>> >>>>>>> >>>>>>> Yes, I can change it accordingly. >>>>>>> >>>>>> I have attached the updated, re-tested patch as per your comments >>>>>> above. >>>>>> >>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>> @@ -70,7 +70,7 @@ >>>>>> ??????? bool use_evex = FLAG_IS_DEFAULT(UseAVX) || (UseAVX > 2); >>>>>> >>>>>> ??????? Label detect_486, cpu486, detect_586, std_cpuid1, std_cpuid4; >>>>>> -??? Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >>>>>> done, wrapup; >>>>>> +??? Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7, >>>>>> ext_cpuid8, done, wrapup; >>>>>> ??????? Label legacy_setup, save_restore_except, legacy_save_restore, >>>>>> start_simd_check; >>>>>> >>>>>> ??????? StubCodeMark mark(this, "VM_Version", "get_cpu_info_stub"); >>>>>> @@ -272,9 +272,23 @@ >>>>>> ??????? __ jccb(Assembler::belowEqual, ext_cpuid5); >>>>>> ??????? __ cmpl(rax, 0x80000007);???? // Is cpuid(0x80000008) >>>>>> supported? >>>>>> ??????? __ jccb(Assembler::belowEqual, ext_cpuid7); >>>>>> +??? __ cmpl(rax, 0x80000008);???? // Is cpuid(0x8000001E) supported? >>>>>> +??? __ jccb(Assembler::belowEqual, ext_cpuid8); >>>>>> +??? // >>>>>> +??? // Extended cpuid(0x8000001E) >>>>>> +??? // >>>>>> +??? __ movl(rax, 0x8000001E); >>>>>> +??? __ cpuid(); >>>>>> +??? __ lea(rsi, Address(rbp, >>>>>> in_bytes(VM_Version::ext_cpuid_1E_offset()))); >>>>>> +??? __ movl(Address(rsi, 0), rax); >>>>>> +??? __ movl(Address(rsi, 4), rbx); >>>>>> +??? __ movl(Address(rsi, 8), rcx); >>>>>> +??? __ movl(Address(rsi,12), rdx); >>>>>> + >>>>>> ??????? // >>>>>> ??????? // Extended cpuid(0x80000008) >>>>>> ??????? // >>>>>> +??? __ bind(ext_cpuid8); >>>>>> ??????? __ movl(rax, 0x80000008); >>>>>> ??????? __ cpuid(); >>>>>> ??????? __ lea(rsi, Address(rbp, >>>>>> in_bytes(VM_Version::ext_cpuid8_offset()))); >>>>>> @@ -1109,11 +1123,27 @@ >>>>>> ??????? } >>>>>> >>>>>> ??? #ifdef COMPILER2 >>>>>> -??? if (MaxVectorSize > 16) { >>>>>> -????? // Limit vectors size to 16 bytes on current AMD cpus. >>>>>> +??? if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>> +????? // Limit vectors size to 16 bytes on AMD cpus < 17h. >>>>>> ????????? FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>> ??????? } >>>>>> ??? #endif // COMPILER2 >>>>>> + >>>>>> +??? // Some defaults for AMD family 17h >>>>>> +??? if ( cpu_family() == 0x17 ) { >>>>>> +????? // On family 17h processors use XMM and UnalignedLoadStores >>>>>> for >>>>>> Array Copy >>>>>> +????? if (supports_sse2() && FLAG_IS_DEFAULT(UseXMMForArrayCopy)) { >>>>>> +??????? FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >>>>>> +????? } >>>>>> +????? if (supports_sse2() && >>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) { >>>>>> +??????? FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >>>>>> +????? } >>>>>> +#ifdef COMPILER2 >>>>>> +????? if (supports_sse4_2() && FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>>>> +??????? FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>> +????? } >>>>>> +#endif >>>>>> +??? } >>>>>> ????? } >>>>>> >>>>>> ????? 
if( is_intel() ) { // Intel cpus specific settings >>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>> @@ -228,6 +228,15 @@ >>>>>> ??????? } bits; >>>>>> ????? }; >>>>>> >>>>>> +? union ExtCpuid_1E_Ebx { >>>>>> +??? uint32_t value; >>>>>> +??? struct { >>>>>> +????? uint32_t????????????????? : 8, >>>>>> +?????????????? threads_per_core : 8, >>>>>> +??????????????????????????????? : 16; >>>>>> +??? } bits; >>>>>> +? }; >>>>>> + >>>>>> ????? union XemXcr0Eax { >>>>>> ??????? uint32_t value; >>>>>> ??????? struct { >>>>>> @@ -398,6 +407,12 @@ >>>>>> ??????? ExtCpuid8Ecx ext_cpuid8_ecx; >>>>>> ??????? uint32_t???? ext_cpuid8_edx; // reserved >>>>>> >>>>>> +??? // cpuid function 0x8000001E // AMD 17h >>>>>> +??? uint32_t??????? ext_cpuid_1E_eax; >>>>>> +??? ExtCpuid_1E_Ebx ext_cpuid_1E_ebx; // threads per core (AMD17h) >>>>>> +??? uint32_t??????? ext_cpuid_1E_ecx; >>>>>> +??? uint32_t??????? ext_cpuid_1E_edx; // unused currently >>>>>> + >>>>>> ??????? // extended control register XCR0 (the XFEATURE_ENABLED_MASK >>>>>> register) >>>>>> ??????? XemXcr0Eax?? xem_xcr0_eax; >>>>>> ??????? uint32_t???? xem_xcr0_edx; // reserved >>>>>> @@ -505,6 +520,14 @@ >>>>>> ????????? result |= CPU_CLMUL; >>>>>> ??????? if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>>>>> ????????? result |= CPU_RTM; >>>>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>> +?????? result |= CPU_ADX; >>>>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>> +????? result |= CPU_BMI2; >>>>>> +??? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>> +????? result |= CPU_SHA; >>>>>> +??? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>> +????? result |= CPU_FMA; >>>>>> >>>>>> ??????? // AMD features. >>>>>> ??????? if (is_amd()) { >>>>>> @@ -518,16 +541,8 @@ >>>>>> ??????? } >>>>>> ??????? // Intel features. >>>>>> ??????? if(is_intel()) { >>>>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>> -???????? result |= CPU_ADX; >>>>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>> -??????? result |= CPU_BMI2; >>>>>> -????? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>> -??????? result |= CPU_SHA; >>>>>> ????????? if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel != 0) >>>>>> ??????????? result |= CPU_LZCNT; >>>>>> -????? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>> -??????? result |= CPU_FMA; >>>>>> ????????? // for Intel, ecx.bits.misalignsse bit (bit 8) indicates >>>>>> support for prefetchw >>>>>> ????????? if (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse != 0) { >>>>>> ??????????? result |= CPU_3DNOW_PREFETCH; >>>>>> @@ -590,6 +605,7 @@ >>>>>> ????? static ByteSize ext_cpuid5_offset() { return >>>>>> byte_offset_of(CpuidInfo, ext_cpuid5_eax); } >>>>>> ????? static ByteSize ext_cpuid7_offset() { return >>>>>> byte_offset_of(CpuidInfo, ext_cpuid7_eax); } >>>>>> ????? static ByteSize ext_cpuid8_offset() { return >>>>>> byte_offset_of(CpuidInfo, ext_cpuid8_eax); } >>>>>> +? static ByteSize ext_cpuid_1E_offset() { return >>>>>> byte_offset_of(CpuidInfo, ext_cpuid_1E_eax); } >>>>>> ????? static ByteSize tpl_cpuidB0_offset() { return >>>>>> byte_offset_of(CpuidInfo, tpl_cpuidB0_eax); } >>>>>> ????? static ByteSize tpl_cpuidB1_offset() { return >>>>>> byte_offset_of(CpuidInfo, tpl_cpuidB1_eax); } >>>>>> ????? static ByteSize tpl_cpuidB2_offset() { return >>>>>> byte_offset_of(CpuidInfo, tpl_cpuidB2_eax); } >>>>>> @@ -673,8 +689,11 @@ >>>>>> ??????? 
if (is_intel() && supports_processor_topology()) { >>>>>> ????????? result = _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus; >>>>>> ??????? } else if (_cpuid_info.std_cpuid1_edx.bits.ht != 0) { >>>>>> -????? result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>>>>> -?????????????? cores_per_cpu(); >>>>>> +????? if (cpu_family() >= 0x17) >>>>>> +??????? result = >>>>>> _cpuid_info.ext_cpuid_1E_ebx.bits.threads_per_core + >>>>>> 1; >>>>>> +????? else >>>>>> +??????? result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>>>>> +???????????????? cores_per_cpu(); >>>>>> ??????? } >>>>>> ??????? return (result == 0 ? 1 : result); >>>>>> ????? } >>>>>> >>>>>> >>>>>> Please let me know your comments >>>>>> >>>>>> Thanks for your time. >>>>>> >>>>>> Regards, >>>>>> Rohit >>>>>> >>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>> >>>>>>>>> Reference: >>>>>>>>> >>>>>>>>> >>>>>>>>> https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf >>>>>>>>> >>>>>>>>> [Pg 82] >>>>>>>>> >>>>>>>>> ??????? CPUID_Fn8000001E_EBX [Core Identifiers] (CoreId) >>>>>>>>> ????????? 15:8 ThreadsPerCore: threads per core. Read-only. Reset: >>>>>>>>> XXh. >>>>>>>>> The number of threads per core is ThreadsPerCore+1. >>>>>>>>> >>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>> @@ -70,7 +70,7 @@ >>>>>>>>> ???????? bool use_evex = FLAG_IS_DEFAULT(UseAVX) || (UseAVX > 2); >>>>>>>>> >>>>>>>>> ???????? Label detect_486, cpu486, detect_586, std_cpuid1, >>>>>>>>> std_cpuid4; >>>>>>>>> -??? Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, >>>>>>>>> ext_cpuid7, >>>>>>>>> done, wrapup; >>>>>>>>> +??? Label sef_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, >>>>>>>>> ext_cpuid7, >>>>>>>>> ext_cpuid8, done, wrapup; >>>>>>>>> ???????? Label legacy_setup, save_restore_except, >>>>>>>>> legacy_save_restore, >>>>>>>>> start_simd_check; >>>>>>>>> >>>>>>>>> ???????? StubCodeMark mark(this, "VM_Version", >>>>>>>>> "get_cpu_info_stub"); >>>>>>>>> @@ -272,9 +272,23 @@ >>>>>>>>> ???????? __ jccb(Assembler::belowEqual, ext_cpuid5); >>>>>>>>> ???????? __ cmpl(rax, 0x80000007);???? // Is cpuid(0x80000008) >>>>>>>>> supported? >>>>>>>>> ???????? __ jccb(Assembler::belowEqual, ext_cpuid7); >>>>>>>>> +??? __ cmpl(rax, 0x80000008);???? // Is cpuid(0x8000001E) >>>>>>>>> supported? >>>>>>>>> +??? __ jccb(Assembler::belowEqual, ext_cpuid8); >>>>>>>>> +??? // >>>>>>>>> +??? // Extended cpuid(0x8000001E) >>>>>>>>> +??? // >>>>>>>>> +??? __ movl(rax, 0x8000001E); >>>>>>>>> +??? __ cpuid(); >>>>>>>>> +??? __ lea(rsi, Address(rbp, >>>>>>>>> in_bytes(VM_Version::ext_cpuid1E_offset()))); >>>>>>>>> +??? __ movl(Address(rsi, 0), rax); >>>>>>>>> +??? __ movl(Address(rsi, 4), rbx); >>>>>>>>> +??? __ movl(Address(rsi, 8), rcx); >>>>>>>>> +??? __ movl(Address(rsi,12), rdx); >>>>>>>>> + >>>>>>>>> ???????? // >>>>>>>>> ???????? // Extended cpuid(0x80000008) >>>>>>>>> ???????? // >>>>>>>>> +??? __ bind(ext_cpuid8); >>>>>>>>> ???????? __ movl(rax, 0x80000008); >>>>>>>>> ???????? __ cpuid(); >>>>>>>>> ???????? __ lea(rsi, Address(rbp, >>>>>>>>> in_bytes(VM_Version::ext_cpuid8_offset()))); >>>>>>>>> @@ -1109,11 +1123,27 @@ >>>>>>>>> ???????? } >>>>>>>>> >>>>>>>>> ???? #ifdef COMPILER2 >>>>>>>>> -??? if (MaxVectorSize > 16) { >>>>>>>>> -????? // Limit vectors size to 16 bytes on current AMD cpus. >>>>>>>>> +??? 
if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>>>>> +????? // Limit vectors size to 16 bytes on AMD cpus < 17h. >>>>>>>>> ?????????? FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>>>>> ???????? } >>>>>>>>> ???? #endif // COMPILER2 >>>>>>>>> + >>>>>>>>> +??? // Some defaults for AMD family 17h >>>>>>>>> +??? if ( cpu_family() == 0x17 ) { >>>>>>>>> +????? // On family 17h processors use XMM and UnalignedLoadStores >>>>>>>>> for >>>>>>>>> Array Copy >>>>>>>>> +????? if (supports_sse2() && >>>>>>>>> FLAG_IS_DEFAULT(UseXMMForArrayCopy)) { >>>>>>>>> +??????? FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >>>>>>>>> +????? } >>>>>>>>> +????? if (supports_sse2() && >>>>>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) >>>>>>>>> { >>>>>>>>> +??????? FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >>>>>>>>> +????? } >>>>>>>>> +#ifdef COMPILER2 >>>>>>>>> +????? if (supports_sse4_2() && >>>>>>>>> FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>>>>>>> +??????? FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>>>>> +????? } >>>>>>>>> +#endif >>>>>>>>> +??? } >>>>>>>>> ?????? } >>>>>>>>> >>>>>>>>> ?????? if( is_intel() ) { // Intel cpus specific settings >>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>> @@ -228,6 +228,15 @@ >>>>>>>>> ???????? } bits; >>>>>>>>> ?????? }; >>>>>>>>> >>>>>>>>> +? union ExtCpuid1EEx { >>>>>>>>> +??? uint32_t value; >>>>>>>>> +??? struct { >>>>>>>>> +????? uint32_t????????????????? : 8, >>>>>>>>> +?????????????? threads_per_core : 8, >>>>>>>>> +??????????????????????????????? : 16; >>>>>>>>> +??? } bits; >>>>>>>>> +? }; >>>>>>>>> + >>>>>>>>> ?????? union XemXcr0Eax { >>>>>>>>> ???????? uint32_t value; >>>>>>>>> ???????? struct { >>>>>>>>> @@ -398,6 +407,12 @@ >>>>>>>>> ???????? ExtCpuid8Ecx ext_cpuid8_ecx; >>>>>>>>> ???????? uint32_t???? ext_cpuid8_edx; // reserved >>>>>>>>> >>>>>>>>> +??? // cpuid function 0x8000001E // AMD 17h >>>>>>>>> +??? uint32_t???? ext_cpuid1E_eax; >>>>>>>>> +??? ExtCpuid1EEx ext_cpuid1E_ebx; // threads per core (AMD17h) >>>>>>>>> +??? uint32_t???? ext_cpuid1E_ecx; >>>>>>>>> +??? uint32_t???? ext_cpuid1E_edx; // unused currently >>>>>>>>> + >>>>>>>>> ???????? // extended control register XCR0 (the >>>>>>>>> XFEATURE_ENABLED_MASK >>>>>>>>> register) >>>>>>>>> ???????? XemXcr0Eax?? xem_xcr0_eax; >>>>>>>>> ???????? uint32_t???? xem_xcr0_edx; // reserved >>>>>>>>> @@ -505,6 +520,14 @@ >>>>>>>>> ?????????? result |= CPU_CLMUL; >>>>>>>>> ???????? if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>>>>>>>> ?????????? result |= CPU_RTM; >>>>>>>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>> +?????? result |= CPU_ADX; >>>>>>>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>> +????? result |= CPU_BMI2; >>>>>>>>> +??? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>> +????? result |= CPU_SHA; >>>>>>>>> +??? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>> +????? result |= CPU_FMA; >>>>>>>>> >>>>>>>>> ???????? // AMD features. >>>>>>>>> ???????? if (is_amd()) { >>>>>>>>> @@ -518,16 +541,8 @@ >>>>>>>>> ???????? } >>>>>>>>> ???????? // Intel features. >>>>>>>>> ???????? if(is_intel()) { >>>>>>>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>> -???????? result |= CPU_ADX; >>>>>>>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>> -??????? result |= CPU_BMI2; >>>>>>>>> -????? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>> -??????? 
result |= CPU_SHA; >>>>>>>>> ?????????? if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel != 0) >>>>>>>>> ???????????? result |= CPU_LZCNT; >>>>>>>>> -????? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>> -??????? result |= CPU_FMA; >>>>>>>>> ?????????? // for Intel, ecx.bits.misalignsse bit (bit 8) >>>>>>>>> indicates >>>>>>>>> support for prefetchw >>>>>>>>> ?????????? if (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse != 0) { >>>>>>>>> ???????????? result |= CPU_3DNOW_PREFETCH; >>>>>>>>> @@ -590,6 +605,7 @@ >>>>>>>>> ?????? static ByteSize ext_cpuid5_offset() { return >>>>>>>>> byte_offset_of(CpuidInfo, ext_cpuid5_eax); } >>>>>>>>> ?????? static ByteSize ext_cpuid7_offset() { return >>>>>>>>> byte_offset_of(CpuidInfo, ext_cpuid7_eax); } >>>>>>>>> ?????? static ByteSize ext_cpuid8_offset() { return >>>>>>>>> byte_offset_of(CpuidInfo, ext_cpuid8_eax); } >>>>>>>>> +? static ByteSize ext_cpuid1E_offset() { return >>>>>>>>> byte_offset_of(CpuidInfo, ext_cpuid1E_eax); } >>>>>>>>> ?????? static ByteSize tpl_cpuidB0_offset() { return >>>>>>>>> byte_offset_of(CpuidInfo, tpl_cpuidB0_eax); } >>>>>>>>> ?????? static ByteSize tpl_cpuidB1_offset() { return >>>>>>>>> byte_offset_of(CpuidInfo, tpl_cpuidB1_eax); } >>>>>>>>> ?????? static ByteSize tpl_cpuidB2_offset() { return >>>>>>>>> byte_offset_of(CpuidInfo, tpl_cpuidB2_eax); } >>>>>>>>> @@ -673,8 +689,11 @@ >>>>>>>>> ???????? if (is_intel() && supports_processor_topology()) { >>>>>>>>> ?????????? result = _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus; >>>>>>>>> ???????? } else if (_cpuid_info.std_cpuid1_edx.bits.ht != 0) { >>>>>>>>> -????? result = _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>>>>>>>> -?????????????? cores_per_cpu(); >>>>>>>>> +????? if (cpu_family() >= 0x17) >>>>>>>>> +??????? result = >>>>>>>>> _cpuid_info.ext_cpuid1E_ebx.bits.threads_per_core + >>>>>>>>> 1; >>>>>>>>> +????? else >>>>>>>>> +??????? result = >>>>>>>>> _cpuid_info.std_cpuid1_ebx.bits.threads_per_cpu / >>>>>>>>> +???????????????? cores_per_cpu(); >>>>>>>>> ???????? } >>>>>>>>> ???????? return (result == 0 ? 1 : result); >>>>>>>>> ?????? } >>>>>>>>> >>>>>>>>> I have attached the patch for review. >>>>>>>>> Please let me know your comments. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Rohit >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Vladimir >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>> >>>>>>>>>>> No comments on AMD specific changes. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> ----- >>>>>>>>>>> >>>>>>>>>>> On 5/09/2017 3:43 PM, David Holmes wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 5/09/2017 3:29 PM, Rohit Arul Raj wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hello David, >>>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Sep 5, 2017 at 10:31 AM, David Holmes >>>>>>>>>>>>> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Rohit, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I was unable to apply your patch to latest jdk10/hs/hotspot >>>>>>>>>>>>>> repo. >>>>>>>>>>>>>> >>>>>>>>>>>>> I checked out the latest jdk10/hs/hotspot [parent: >>>>>>>>>>>>> 13548:1a9c2e07a826] >>>>>>>>>>>>> and was able to apply the patch >>>>>>>>>>>>> [epyc-amd17h-defaults-3Sept.patch] >>>>>>>>>>>>> without any issues. >>>>>>>>>>>>> Can you share the error message that you are getting? 
>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I was getting this: >>>>>>>>>>>> >>>>>>>>>>>> applying hotspot.patch >>>>>>>>>>>> patching file src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>> Hunk #1 FAILED at 1108 >>>>>>>>>>>> 1 out of 1 hunks FAILED -- saving rejects to file >>>>>>>>>>>> src/cpu/x86/vm/vm_version_x86.cpp.rej >>>>>>>>>>>> patching file src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>> Hunk #2 FAILED at 522 >>>>>>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file >>>>>>>>>>>> src/cpu/x86/vm/vm_version_x86.hpp.rej >>>>>>>>>>>> abort: patch failed to apply >>>>>>>>>>>> >>>>>>>>>>>> but I started again and this time it applied fine, so not sure >>>>>>>>>>>> what >>>>>>>>>>>> was >>>>>>>>>>>> going on there. >>>>>>>>>>>> >>>>>>>>>>>> Cheers, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> Regards, >>>>>>>>>>>>> Rohit >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 4/09/2017 2:42 AM, Rohit Arul Raj wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hello Vladimir, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Sat, Sep 2, 2017 at 11:25 PM, Vladimir Kozlov >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Rohit, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 9/2/17 1:16 AM, Rohit Arul Raj wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hello Vladimir, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Changes look good. Only question I have is about >>>>>>>>>>>>>>>>>> MaxVectorSize. >>>>>>>>>>>>>>>>>> It >>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>> set >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 16 only in presence of AVX: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/046eab27258f/src/cpu/x86/vm/vm_version_x86.cpp#l945 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Does that code works for AMD 17h too? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks for pointing that out. Yes, the code works fine for >>>>>>>>>>>>>>>>> AMD >>>>>>>>>>>>>>>>> 17h. >>>>>>>>>>>>>>>>> So >>>>>>>>>>>>>>>>> I have removed the surplus check for MaxVectorSize from my >>>>>>>>>>>>>>>>> patch. >>>>>>>>>>>>>>>>> I >>>>>>>>>>>>>>>>> have updated, re-tested and attached the patch. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Which check you removed? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> My older patch had the below mentioned check which was >>>>>>>>>>>>>>> required >>>>>>>>>>>>>>> on >>>>>>>>>>>>>>> JDK9 where the default MaxVectorSize was 64. It has been >>>>>>>>>>>>>>> handled >>>>>>>>>>>>>>> better in openJDK10. So this check is not required anymore. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> +??? // Some defaults for AMD family 17h >>>>>>>>>>>>>>> +??? if ( cpu_family() == 0x17 ) { >>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>> +????? if (MaxVectorSize > 32) { >>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(MaxVectorSize, 32); >>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>> .. >>>>>>>>>>>>>>> .. >>>>>>>>>>>>>>> +????? 
} >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I have one query regarding the setting of UseSHA flag: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/046eab27258f/src/cpu/x86/vm/vm_version_x86.cpp#l821 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> AMD 17h has support for SHA. >>>>>>>>>>>>>>>>> AMD 15h doesn't have? support for SHA. Still "UseSHA" flag >>>>>>>>>>>>>>>>> gets >>>>>>>>>>>>>>>>> enabled for it based on the availability of BMI2 and >>>>>>>>>>>>>>>>> AVX2. Is >>>>>>>>>>>>>>>>> there >>>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>> underlying reason for this? I have handled this in the >>>>>>>>>>>>>>>>> patch >>>>>>>>>>>>>>>>> but >>>>>>>>>>>>>>>>> just >>>>>>>>>>>>>>>>> wanted to confirm. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> It was done with next changes which use only AVX2 and BMI2 >>>>>>>>>>>>>>>> instructions >>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>> calculate SHA-256: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/6a17c49de974 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I don't know if AMD 15h supports these instructions and can >>>>>>>>>>>>>>>> execute >>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>> code. You need to test it. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Ok, got it. Since AMD15h has support for AVX2 and BMI2 >>>>>>>>>>>>>>> instructions, >>>>>>>>>>>>>>> it should work. >>>>>>>>>>>>>>> Confirmed by running following sanity tests: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ./hotspot/test/compiler/intrinsics/sha/sanity/TestSHA1Intrinsics.java >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ./hotspot/test/compiler/intrinsics/sha/sanity/TestSHA512Intrinsics.java >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ./hotspot/test/compiler/intrinsics/sha/sanity/TestSHA256Intrinsics.java >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> So I have removed those SHA checks from my patch too. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please find attached updated, re-tested patch. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>> @@ -1109,11 +1109,27 @@ >>>>>>>>>>>>>>> ?????????? } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ?????? #ifdef COMPILER2 >>>>>>>>>>>>>>> -??? if (MaxVectorSize > 16) { >>>>>>>>>>>>>>> -????? // Limit vectors size to 16 bytes on current AMD >>>>>>>>>>>>>>> cpus. >>>>>>>>>>>>>>> +??? if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>>>>>>>>>>> +????? // Limit vectors size to 16 bytes on AMD cpus < 17h. >>>>>>>>>>>>>>> ???????????? FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>>>>>>>>>>> ?????????? } >>>>>>>>>>>>>>> ?????? #endif // COMPILER2 >>>>>>>>>>>>>>> + >>>>>>>>>>>>>>> +??? // Some defaults for AMD family 17h >>>>>>>>>>>>>>> +??? if ( cpu_family() == 0x17 ) { >>>>>>>>>>>>>>> +????? // On family 17h processors use XMM and >>>>>>>>>>>>>>> UnalignedLoadStores >>>>>>>>>>>>>>> for >>>>>>>>>>>>>>> Array Copy >>>>>>>>>>>>>>> +????? if (supports_sse2() && >>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseXMMForArrayCopy)) { >>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>> +????? 
if (supports_sse2() && >>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) >>>>>>>>>>>>>>> { >>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>> +#ifdef COMPILER2 >>>>>>>>>>>>>>> +????? if (supports_sse4_2() && >>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseFPUForSpilling)) >>>>>>>>>>>>>>> { >>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>>> +??? } >>>>>>>>>>>>>>> ???????? } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ???????? if( is_intel() ) { // Intel cpus specific settings >>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>> @@ -505,6 +505,14 @@ >>>>>>>>>>>>>>> ???????????? result |= CPU_CLMUL; >>>>>>>>>>>>>>> ?????????? if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>>>>>>>>>>>>>> ???????????? result |= CPU_RTM; >>>>>>>>>>>>>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>> +?????? result |= CPU_ADX; >>>>>>>>>>>>>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>> +????? result |= CPU_BMI2; >>>>>>>>>>>>>>> +??? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>> +????? result |= CPU_SHA; >>>>>>>>>>>>>>> +??? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>> +????? result |= CPU_FMA; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ?????????? // AMD features. >>>>>>>>>>>>>>> ?????????? if (is_amd()) { >>>>>>>>>>>>>>> @@ -515,19 +523,13 @@ >>>>>>>>>>>>>>> ?????????????? result |= CPU_LZCNT; >>>>>>>>>>>>>>> ???????????? if (_cpuid_info.ext_cpuid1_ecx.bits.sse4a != 0) >>>>>>>>>>>>>>> ?????????????? result |= CPU_SSE4A; >>>>>>>>>>>>>>> +????? if(_cpuid_info.std_cpuid1_edx.bits.ht != 0) >>>>>>>>>>>>>>> +??????? result |= CPU_HT; >>>>>>>>>>>>>>> ?????????? } >>>>>>>>>>>>>>> ?????????? // Intel features. >>>>>>>>>>>>>>> ?????????? if(is_intel()) { >>>>>>>>>>>>>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>> -???????? result |= CPU_ADX; >>>>>>>>>>>>>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>> -??????? result |= CPU_BMI2; >>>>>>>>>>>>>>> -????? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>> -??????? result |= CPU_SHA; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel != >>>>>>>>>>>>>>> 0) >>>>>>>>>>>>>>> ?????????????? result |= CPU_LZCNT; >>>>>>>>>>>>>>> -????? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>> -??????? result |= CPU_FMA; >>>>>>>>>>>>>>> ???????????? // for Intel, ecx.bits.misalignsse bit (bit 8) >>>>>>>>>>>>>>> indicates >>>>>>>>>>>>>>> support for prefetchw >>>>>>>>>>>>>>> ???????????? if >>>>>>>>>>>>>>> (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse != >>>>>>>>>>>>>>> 0) >>>>>>>>>>>>>>> { >>>>>>>>>>>>>>> ?????????????? result |= CPU_3DNOW_PREFETCH; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please let me know your comments. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks for your time. >>>>>>>>>>>>>>> Rohit >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks for taking time to review the code. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>> @@ -1088,6 +1088,22 @@ >>>>>>>>>>>>>>>>> ????????????? } >>>>>>>>>>>>>>>>> ????????????? 
FLAG_SET_DEFAULT(UseSSE42Intrinsics, false); >>>>>>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>>>>>> +??? if (supports_sha()) { >>>>>>>>>>>>>>>>> +????? if (FLAG_IS_DEFAULT(UseSHA)) { >>>>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(UseSHA, true); >>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>> +??? } else if (UseSHA || UseSHA1Intrinsics || >>>>>>>>>>>>>>>>> UseSHA256Intrinsics >>>>>>>>>>>>>>>>> || >>>>>>>>>>>>>>>>> UseSHA512Intrinsics) { >>>>>>>>>>>>>>>>> +????? if (!FLAG_IS_DEFAULT(UseSHA) || >>>>>>>>>>>>>>>>> +????????? !FLAG_IS_DEFAULT(UseSHA1Intrinsics) || >>>>>>>>>>>>>>>>> +????????? !FLAG_IS_DEFAULT(UseSHA256Intrinsics) || >>>>>>>>>>>>>>>>> +????????? !FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>>> +??????? warning("SHA instructions are not available on >>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>> CPU"); >>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA, false); >>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>> +??? } >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ??????????? // some defaults for AMD family 15h >>>>>>>>>>>>>>>>> ??????????? if ( cpu_family() == 0x15 ) { >>>>>>>>>>>>>>>>> @@ -1109,11 +1125,40 @@ >>>>>>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ??????? #ifdef COMPILER2 >>>>>>>>>>>>>>>>> -??? if (MaxVectorSize > 16) { >>>>>>>>>>>>>>>>> -????? // Limit vectors size to 16 bytes on current AMD >>>>>>>>>>>>>>>>> cpus. >>>>>>>>>>>>>>>>> +??? if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>>>>>>>>>>>>> +????? // Limit vectors size to 16 bytes on AMD cpus < >>>>>>>>>>>>>>>>> 17h. >>>>>>>>>>>>>>>>> ????????????? FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>>>>>> ??????? #endif // COMPILER2 >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> +??? // Some defaults for AMD family 17h >>>>>>>>>>>>>>>>> +??? if ( cpu_family() == 0x17 ) { >>>>>>>>>>>>>>>>> +????? // On family 17h processors use XMM and >>>>>>>>>>>>>>>>> UnalignedLoadStores >>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>> Array Copy >>>>>>>>>>>>>>>>> +????? if (supports_sse2() && >>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseXMMForArrayCopy)) >>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(UseXMMForArrayCopy, true); >>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>> +????? if (supports_sse2() && >>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) { >>>>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(UseUnalignedLoadStores, true); >>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>> +????? if (supports_bmi2() && >>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseBMI2Instructions)) >>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(UseBMI2Instructions, true); >>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>> +????? if (UseSHA) { >>>>>>>>>>>>>>>>> +??????? if (FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>>> +????????? FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>> +??????? } else if (UseSHA512Intrinsics) { >>>>>>>>>>>>>>>>> +????????? warning("Intrinsics for SHA-384 and SHA-512 >>>>>>>>>>>>>>>>> crypto >>>>>>>>>>>>>>>>> hash >>>>>>>>>>>>>>>>> functions not available on this CPU."); >>>>>>>>>>>>>>>>> +????????? FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>> +??????? } >>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>> +#ifdef COMPILER2 >>>>>>>>>>>>>>>>> +????? if (supports_sse4_2()) { >>>>>>>>>>>>>>>>> +??????? 
if (FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>>>>>>>>>>>>>>> +????????? FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>>>>>>>>>>>>> +??????? } >>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>>>>> +??? } >>>>>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ????????? if( is_intel() ) { // Intel cpus specific >>>>>>>>>>>>>>>>> settings >>>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>> @@ -505,6 +505,14 @@ >>>>>>>>>>>>>>>>> ????????????? result |= CPU_CLMUL; >>>>>>>>>>>>>>>>> ??????????? if (_cpuid_info.sef_cpuid7_ebx.bits.rtm != 0) >>>>>>>>>>>>>>>>> ????????????? result |= CPU_RTM; >>>>>>>>>>>>>>>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>>>> +?????? result |= CPU_ADX; >>>>>>>>>>>>>>>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>>>> +????? result |= CPU_BMI2; >>>>>>>>>>>>>>>>> +??? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>>>> +????? result |= CPU_SHA; >>>>>>>>>>>>>>>>> +??? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>>>> +????? result |= CPU_FMA; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ??????????? // AMD features. >>>>>>>>>>>>>>>>> ??????????? if (is_amd()) { >>>>>>>>>>>>>>>>> @@ -515,19 +523,13 @@ >>>>>>>>>>>>>>>>> ??????????????? result |= CPU_LZCNT; >>>>>>>>>>>>>>>>> ????????????? if (_cpuid_info.ext_cpuid1_ecx.bits.sse4a >>>>>>>>>>>>>>>>> != 0) >>>>>>>>>>>>>>>>> ??????????????? result |= CPU_SSE4A; >>>>>>>>>>>>>>>>> +????? if(_cpuid_info.std_cpuid1_edx.bits.ht != 0) >>>>>>>>>>>>>>>>> +??????? result |= CPU_HT; >>>>>>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>>>>>> ??????????? // Intel features. >>>>>>>>>>>>>>>>> ??????????? if(is_intel()) { >>>>>>>>>>>>>>>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>>>> -???????? result |= CPU_ADX; >>>>>>>>>>>>>>>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>>>> -??????? result |= CPU_BMI2; >>>>>>>>>>>>>>>>> -????? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>>>> -??????? result |= CPU_SHA; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel >>>>>>>>>>>>>>>>> != >>>>>>>>>>>>>>>>> 0) >>>>>>>>>>>>>>>>> ??????????????? result |= CPU_LZCNT; >>>>>>>>>>>>>>>>> -????? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>>>> -??????? result |= CPU_FMA; >>>>>>>>>>>>>>>>> ????????????? // for Intel, ecx.bits.misalignsse bit >>>>>>>>>>>>>>>>> (bit 8) >>>>>>>>>>>>>>>>> indicates >>>>>>>>>>>>>>>>> support for prefetchw >>>>>>>>>>>>>>>>> ????????????? if >>>>>>>>>>>>>>>>> (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse >>>>>>>>>>>>>>>>> != >>>>>>>>>>>>>>>>> 0) { >>>>>>>>>>>>>>>>> ??????????????? 
result |= CPU_3DNOW_PREFETCH; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>>> Rohit >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 9/1/17 8:04 AM, Rohit Arul Raj wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, Sep 1, 2017 at 10:27 AM, Rohit Arul Raj >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Fri, Sep 1, 2017 at 3:01 AM, David Holmes >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Hi Rohit, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I think the patch needs updating for jdk10 as I >>>>>>>>>>>>>>>>>>>>> already >>>>>>>>>>>>>>>>>>>>> see >>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>> lot of >>>>>>>>>>>>>>>>>>>>> logic >>>>>>>>>>>>>>>>>>>>> around UseSHA in vm_version_x86.cpp. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks David, I will update the patch wrt JDK10 source >>>>>>>>>>>>>>>>>>>> base, >>>>>>>>>>>>>>>>>>>> test >>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>> resubmit for review. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>>>>>> Rohit >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi All, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I have updated the patch wrt openjdk10/hotspot (parent: >>>>>>>>>>>>>>>>>>> 13519:71337910df60), did regression testing using jtreg >>>>>>>>>>>>>>>>>>> ($make >>>>>>>>>>>>>>>>>>> default) and didnt find any regressions. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Can anyone please volunteer to review this patch? which >>>>>>>>>>>>>>>>>>> sets >>>>>>>>>>>>>>>>>>> flag/ISA >>>>>>>>>>>>>>>>>>> defaults for newer AMD 17h (EPYC) processor? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ************************* Patch >>>>>>>>>>>>>>>>>>> **************************** >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>> @@ -1088,6 +1088,22 @@ >>>>>>>>>>>>>>>>>>> ?????????????? } >>>>>>>>>>>>>>>>>>> ?????????????? FLAG_SET_DEFAULT(UseSSE42Intrinsics, >>>>>>>>>>>>>>>>>>> false); >>>>>>>>>>>>>>>>>>> ???????????? } >>>>>>>>>>>>>>>>>>> +??? if (supports_sha()) { >>>>>>>>>>>>>>>>>>> +????? if (FLAG_IS_DEFAULT(UseSHA)) { >>>>>>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(UseSHA, true); >>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>> +??? } else if (UseSHA || UseSHA1Intrinsics || >>>>>>>>>>>>>>>>>>> UseSHA256Intrinsics >>>>>>>>>>>>>>>>>>> || >>>>>>>>>>>>>>>>>>> UseSHA512Intrinsics) { >>>>>>>>>>>>>>>>>>> +????? if (!FLAG_IS_DEFAULT(UseSHA) || >>>>>>>>>>>>>>>>>>> +????????? !FLAG_IS_DEFAULT(UseSHA1Intrinsics) || >>>>>>>>>>>>>>>>>>> +????????? !FLAG_IS_DEFAULT(UseSHA256Intrinsics) || >>>>>>>>>>>>>>>>>>> +????????? !FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>>>>> +??????? warning("SHA instructions are not available on >>>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>> CPU"); >>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA, false); >>>>>>>>>>>>>>>>>>> +????? 
FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>>>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>>>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>>>> +??? } >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ???????????? // some defaults for AMD family 15h >>>>>>>>>>>>>>>>>>> ???????????? if ( cpu_family() == 0x15 ) { >>>>>>>>>>>>>>>>>>> @@ -1109,11 +1125,43 @@ >>>>>>>>>>>>>>>>>>> ???????????? } >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ???????? #ifdef COMPILER2 >>>>>>>>>>>>>>>>>>> -??? if (MaxVectorSize > 16) { >>>>>>>>>>>>>>>>>>> -????? // Limit vectors size to 16 bytes on current AMD >>>>>>>>>>>>>>>>>>> cpus. >>>>>>>>>>>>>>>>>>> +??? if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>>>>>>>>>>>>>>> +????? // Limit vectors size to 16 bytes on AMD cpus >>>>>>>>>>>>>>>>>>> < 17h. >>>>>>>>>>>>>>>>>>> ?????????????? FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>>>>>>>>>>>>>>> ???????????? } >>>>>>>>>>>>>>>>>>> ???????? #endif // COMPILER2 >>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>> +??? // Some defaults for AMD family 17h >>>>>>>>>>>>>>>>>>> +??? if ( cpu_family() == 0x17 ) { >>>>>>>>>>>>>>>>>>> +????? // On family 17h processors use XMM and >>>>>>>>>>>>>>>>>>> UnalignedLoadStores >>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>> Array Copy >>>>>>>>>>>>>>>>>>> +????? if (supports_sse2() && >>>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseXMMForArrayCopy)) >>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>> +??????? UseXMMForArrayCopy = true; >>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>> +????? if (supports_sse2() && >>>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) >>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>> +??????? UseUnalignedLoadStores = true; >>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>> +????? if (supports_bmi2() && >>>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseBMI2Instructions)) { >>>>>>>>>>>>>>>>>>> +??????? UseBMI2Instructions = true; >>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>> +????? if (MaxVectorSize > 32) { >>>>>>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(MaxVectorSize, 32); >>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>> +????? if (UseSHA) { >>>>>>>>>>>>>>>>>>> +??????? if (FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>>>>> +????????? FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>>>> +??????? } else if (UseSHA512Intrinsics) { >>>>>>>>>>>>>>>>>>> +????????? warning("Intrinsics for SHA-384 and SHA-512 >>>>>>>>>>>>>>>>>>> crypto >>>>>>>>>>>>>>>>>>> hash >>>>>>>>>>>>>>>>>>> functions not available on this CPU."); >>>>>>>>>>>>>>>>>>> +????????? FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>>>> +??????? } >>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>> +#ifdef COMPILER2 >>>>>>>>>>>>>>>>>>> +????? if (supports_sse4_2()) { >>>>>>>>>>>>>>>>>>> +??????? if (FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>>>>>>>>>>>>>>>>> +????????? FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>>>>>>>>>>>>>>> +??????? } >>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>>>>>>> +??? } >>>>>>>>>>>>>>>>>>> ?????????? } >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ?????????? if( is_intel() ) { // Intel cpus specific >>>>>>>>>>>>>>>>>>> settings >>>>>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>> @@ -505,6 +505,14 @@ >>>>>>>>>>>>>>>>>>> ?????????????? result |= CPU_CLMUL; >>>>>>>>>>>>>>>>>>> ???????????? 
if (_cpuid_info.sef_cpuid7_ebx.bits.rtm >>>>>>>>>>>>>>>>>>> != 0) >>>>>>>>>>>>>>>>>>> ?????????????? result |= CPU_RTM; >>>>>>>>>>>>>>>>>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>>>>>> +?????? result |= CPU_ADX; >>>>>>>>>>>>>>>>>>> +??? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>>>>>> +????? result |= CPU_BMI2; >>>>>>>>>>>>>>>>>>> +??? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>>>>>> +????? result |= CPU_SHA; >>>>>>>>>>>>>>>>>>> +??? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>>>>>> +????? result |= CPU_FMA; >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ???????????? // AMD features. >>>>>>>>>>>>>>>>>>> ???????????? if (is_amd()) { >>>>>>>>>>>>>>>>>>> @@ -515,19 +523,13 @@ >>>>>>>>>>>>>>>>>>> ???????????????? result |= CPU_LZCNT; >>>>>>>>>>>>>>>>>>> ?????????????? if >>>>>>>>>>>>>>>>>>> (_cpuid_info.ext_cpuid1_ecx.bits.sse4a != >>>>>>>>>>>>>>>>>>> 0) >>>>>>>>>>>>>>>>>>> ???????????????? result |= CPU_SSE4A; >>>>>>>>>>>>>>>>>>> +????? if(_cpuid_info.std_cpuid1_edx.bits.ht != 0) >>>>>>>>>>>>>>>>>>> +??????? result |= CPU_HT; >>>>>>>>>>>>>>>>>>> ???????????? } >>>>>>>>>>>>>>>>>>> ???????????? // Intel features. >>>>>>>>>>>>>>>>>>> ???????????? if(is_intel()) { >>>>>>>>>>>>>>>>>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>>>>>> -???????? result |= CPU_ADX; >>>>>>>>>>>>>>>>>>> -????? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>>>>>> -??????? result |= CPU_BMI2; >>>>>>>>>>>>>>>>>>> -????? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>>>>>> -??????? result |= CPU_SHA; >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> if(_cpuid_info.ext_cpuid1_ecx.bits.lzcnt_intel >>>>>>>>>>>>>>>>>>> != 0) >>>>>>>>>>>>>>>>>>> ???????????????? result |= CPU_LZCNT; >>>>>>>>>>>>>>>>>>> -????? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>>>>>> -??????? result |= CPU_FMA; >>>>>>>>>>>>>>>>>>> ?????????????? // for Intel, ecx.bits.misalignsse bit >>>>>>>>>>>>>>>>>>> (bit >>>>>>>>>>>>>>>>>>> 8) >>>>>>>>>>>>>>>>>>> indicates >>>>>>>>>>>>>>>>>>> support for prefetchw >>>>>>>>>>>>>>>>>>> ?????????????? if >>>>>>>>>>>>>>>>>>> (_cpuid_info.ext_cpuid1_ecx.bits.misalignsse >>>>>>>>>>>>>>>>>>> != >>>>>>>>>>>>>>>>>>> 0) { >>>>>>>>>>>>>>>>>>> ???????????????? 
result |= CPU_3DNOW_PREFETCH; >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ************************************************************** >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> Rohit >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 1/09/2017 1:11 AM, Rohit Arul Raj wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 5:59 PM, David Holmes >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hi Rohit, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 31/08/2017 7:03 PM, Rohit Arul Raj wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I would like an volunteer to review this patch >>>>>>>>>>>>>>>>>>>>>>>> (openJDK9) >>>>>>>>>>>>>>>>>>>>>>>> which >>>>>>>>>>>>>>>>>>>>>>>> sets >>>>>>>>>>>>>>>>>>>>>>>> flag/ISA defaults for newer AMD 17h (EPYC) >>>>>>>>>>>>>>>>>>>>>>>> processor >>>>>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>> help >>>>>>>>>>>>>>>>>>>>>>>> us >>>>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>>> the commit process. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> https://www.dropbox.com/sh/08bsxaxupg8kbam/AADurTXLGIZ6C-tiIAi_Glyka?dl=0 >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Unfortunately patches can not be accepted from >>>>>>>>>>>>>>>>>>>>>>> systems >>>>>>>>>>>>>>>>>>>>>>> outside >>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> OpenJDK >>>>>>>>>>>>>>>>>>>>>>> infrastructure and ... >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I have also attached the patch (hg diff -g) for >>>>>>>>>>>>>>>>>>>>>>>> reference. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> ... unfortunately patches tend to get stripped by >>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> mail >>>>>>>>>>>>>>>>>>>>>>> servers. >>>>>>>>>>>>>>>>>>>>>>> If >>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> patch is small please include it inline. >>>>>>>>>>>>>>>>>>>>>>> Otherwise you >>>>>>>>>>>>>>>>>>>>>>> will >>>>>>>>>>>>>>>>>>>>>>> need >>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>> find >>>>>>>>>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>>>>>>>> OpenJDK Author who can host it for you on >>>>>>>>>>>>>>>>>>>>>>> cr.openjdk.java.net. 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> 3) I have done regression testing using jtreg >>>>>>>>>>>>>>>>>>>>>>>> ($make >>>>>>>>>>>>>>>>>>>>>>>> default) >>>>>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>> didnt find any regressions. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Sounds good, but until I see the patch it is hard to >>>>>>>>>>>>>>>>>>>>>>> comment >>>>>>>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>>>>>> testing >>>>>>>>>>>>>>>>>>>>>>> requirements. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks David, >>>>>>>>>>>>>>>>>>>>>> Yes, it's a small patch. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp >>>>>>>>>>>>>>>>>>>>>> @@ -1051,6 +1051,22 @@ >>>>>>>>>>>>>>>>>>>>>> ??????????????? } >>>>>>>>>>>>>>>>>>>>>> ??????????????? FLAG_SET_DEFAULT(UseSSE42Intrinsics, >>>>>>>>>>>>>>>>>>>>>> false); >>>>>>>>>>>>>>>>>>>>>> ????????????? } >>>>>>>>>>>>>>>>>>>>>> +??? if (supports_sha()) { >>>>>>>>>>>>>>>>>>>>>> +????? if (FLAG_IS_DEFAULT(UseSHA)) { >>>>>>>>>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(UseSHA, true); >>>>>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>>>>> +??? } else if (UseSHA || UseSHA1Intrinsics || >>>>>>>>>>>>>>>>>>>>>> UseSHA256Intrinsics >>>>>>>>>>>>>>>>>>>>>> || >>>>>>>>>>>>>>>>>>>>>> UseSHA512Intrinsics) { >>>>>>>>>>>>>>>>>>>>>> +????? if (!FLAG_IS_DEFAULT(UseSHA) || >>>>>>>>>>>>>>>>>>>>>> +????????? !FLAG_IS_DEFAULT(UseSHA1Intrinsics) || >>>>>>>>>>>>>>>>>>>>>> +????????? !FLAG_IS_DEFAULT(UseSHA256Intrinsics) || >>>>>>>>>>>>>>>>>>>>>> +????????? !FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>>>>>>>> +??????? warning("SHA instructions are not >>>>>>>>>>>>>>>>>>>>>> available on >>>>>>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>>>>> CPU"); >>>>>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA, false); >>>>>>>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>>>>>>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>>>>>>>>>>>>>>>>>>>>> +????? FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>>>>>>>>>>>>>>>>>>> +??? } >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> ????????????? // some defaults for AMD family 15h >>>>>>>>>>>>>>>>>>>>>> ????????????? if ( cpu_family() == 0x15 ) { >>>>>>>>>>>>>>>>>>>>>> @@ -1072,11 +1088,43 @@ >>>>>>>>>>>>>>>>>>>>>> ????????????? } >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> ????????? #ifdef COMPILER2 >>>>>>>>>>>>>>>>>>>>>> -??? if (MaxVectorSize > 16) { >>>>>>>>>>>>>>>>>>>>>> -????? // Limit vectors size to 16 bytes on >>>>>>>>>>>>>>>>>>>>>> current AMD >>>>>>>>>>>>>>>>>>>>>> cpus. >>>>>>>>>>>>>>>>>>>>>> +??? if (cpu_family() < 0x17 && MaxVectorSize > 16) { >>>>>>>>>>>>>>>>>>>>>> +????? // Limit vectors size to 16 bytes on AMD >>>>>>>>>>>>>>>>>>>>>> cpus < >>>>>>>>>>>>>>>>>>>>>> 17h. >>>>>>>>>>>>>>>>>>>>>> ??????????????? FLAG_SET_DEFAULT(MaxVectorSize, 16); >>>>>>>>>>>>>>>>>>>>>> ????????????? 
} >>>>>>>>>>>>>>>>>>>>>> ????????? #endif // COMPILER2 >>>>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>>> +??? // Some defaults for AMD family 17h >>>>>>>>>>>>>>>>>>>>>> +??? if ( cpu_family() == 0x17 ) { >>>>>>>>>>>>>>>>>>>>>> +????? // On family 17h processors use XMM and >>>>>>>>>>>>>>>>>>>>>> UnalignedLoadStores >>>>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>>>> Array Copy >>>>>>>>>>>>>>>>>>>>>> +????? if (supports_sse2() && >>>>>>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseXMMForArrayCopy)) >>>>>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>>> +??????? UseXMMForArrayCopy = true; >>>>>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>>>>> +????? if (supports_sse2() && >>>>>>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseUnalignedLoadStores)) >>>>>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>>> +??????? UseUnalignedLoadStores = true; >>>>>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>>>>> +????? if (supports_bmi2() && >>>>>>>>>>>>>>>>>>>>>> FLAG_IS_DEFAULT(UseBMI2Instructions)) >>>>>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>>> +??????? UseBMI2Instructions = true; >>>>>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>>>>> +????? if (MaxVectorSize > 32) { >>>>>>>>>>>>>>>>>>>>>> +??????? FLAG_SET_DEFAULT(MaxVectorSize, 32); >>>>>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>>>>> +????? if (UseSHA) { >>>>>>>>>>>>>>>>>>>>>> +??????? if (FLAG_IS_DEFAULT(UseSHA512Intrinsics)) { >>>>>>>>>>>>>>>>>>>>>> +????????? FLAG_SET_DEFAULT(UseSHA512Intrinsics, >>>>>>>>>>>>>>>>>>>>>> false); >>>>>>>>>>>>>>>>>>>>>> +??????? } else if (UseSHA512Intrinsics) { >>>>>>>>>>>>>>>>>>>>>> +????????? warning("Intrinsics for SHA-384 and >>>>>>>>>>>>>>>>>>>>>> SHA-512 >>>>>>>>>>>>>>>>>>>>>> crypto >>>>>>>>>>>>>>>>>>>>>> hash >>>>>>>>>>>>>>>>>>>>>> functions not available on this CPU."); >>>>>>>>>>>>>>>>>>>>>> +????????? FLAG_SET_DEFAULT(UseSHA512Intrinsics, >>>>>>>>>>>>>>>>>>>>>> false); >>>>>>>>>>>>>>>>>>>>>> +??????? } >>>>>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>>>>> +#ifdef COMPILER2 >>>>>>>>>>>>>>>>>>>>>> +????? if (supports_sse4_2()) { >>>>>>>>>>>>>>>>>>>>>> +??????? if (FLAG_IS_DEFAULT(UseFPUForSpilling)) { >>>>>>>>>>>>>>>>>>>>>> +????????? FLAG_SET_DEFAULT(UseFPUForSpilling, true); >>>>>>>>>>>>>>>>>>>>>> +??????? } >>>>>>>>>>>>>>>>>>>>>> +????? } >>>>>>>>>>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>>>>>>>>>> +??? } >>>>>>>>>>>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> ??????????? if( is_intel() ) { // Intel cpus specific >>>>>>>>>>>>>>>>>>>>>> settings >>>>>>>>>>>>>>>>>>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp >>>>>>>>>>>>>>>>>>>>>> @@ -513,6 +513,16 @@ >>>>>>>>>>>>>>>>>>>>>> ????????????????? result |= CPU_LZCNT; >>>>>>>>>>>>>>>>>>>>>> ??????????????? if >>>>>>>>>>>>>>>>>>>>>> (_cpuid_info.ext_cpuid1_ecx.bits.sse4a >>>>>>>>>>>>>>>>>>>>>> != >>>>>>>>>>>>>>>>>>>>>> 0) >>>>>>>>>>>>>>>>>>>>>> ????????????????? result |= CPU_SSE4A; >>>>>>>>>>>>>>>>>>>>>> +????? if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0) >>>>>>>>>>>>>>>>>>>>>> +??????? result |= CPU_BMI2; >>>>>>>>>>>>>>>>>>>>>> +????? if(_cpuid_info.std_cpuid1_edx.bits.ht != 0) >>>>>>>>>>>>>>>>>>>>>> +??????? result |= CPU_HT; >>>>>>>>>>>>>>>>>>>>>> +????? if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0) >>>>>>>>>>>>>>>>>>>>>> +??????? result |= CPU_ADX; >>>>>>>>>>>>>>>>>>>>>> +????? if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0) >>>>>>>>>>>>>>>>>>>>>> +??????? 
result |= CPU_SHA; >>>>>>>>>>>>>>>>>>>>>> +????? if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0) >>>>>>>>>>>>>>>>>>>>>> +??????? result |= CPU_FMA; >>>>>>>>>>>>>>>>>>>>>> ????????????? } >>>>>>>>>>>>>>>>>>>>>> ????????????? // Intel features. >>>>>>>>>>>>>>>>>>>>>> ????????????? if(is_intel()) { >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>>>>>>>> Rohit >>>>>>>>>>>>>>>>>>>>>> > From kim.barrett at oracle.com Tue Oct 17 19:52:22 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 17 Oct 2017 15:52:22 -0400 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> Message-ID: > On Oct 16, 2017, at 10:14 AM, Stefan Karlsson wrote: > > Hi all, > > Please review this patch to move the JNI global weak handle processing out of the ReferenceProcessor into a new class, WeakProcessor, that will be used to gather processing and cleaning of "native weak" oops. > > After this patch the ReferenceProcessor will only deal with the Java level java.lang.ref weak references. > > http://cr.openjdk.java.net/~stefank/8189359/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8189359 > > Note this patch only moves the JNIHandles::weak_oops_do calls into the new WeakProcessor. A subsequent patch for JDK-8189359 will move the JvmtiExport::weak_oops_do from JNIHandleBlock into the WeakProcessor. > > Future patches like JDK-8171119, for example, will be able to add it's set of native weak oops into the new WeakProcessor functions and won't have to duplicate the code for all GCs or add call inside the ReferenceProcessor. > > Tested with JPRT. > > Thanks, > StefanK Mostly OK, and nice to have this cleaned up, esp. with the JDK-8189359 followup. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/weakProcessor.hpp I don't understand the name of unlink_or_oops_do. A little hint as to the semantics of the two functions in WeakProcessor might help. Right now, it's not at all obvious how they differ, other than by signature. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1CollectedHeap.cpp This change seems to remove the only call to process_weak_jni_handles(). ------------------------------------------------------------------------------ From kim.barrett at oracle.com Tue Oct 17 19:55:06 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 17 Oct 2017 15:55:06 -0400 Subject: 8189360: JvmtiExport::weak_oops_do is called for all JNIHandleBlock instances In-Reply-To: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> References: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> Message-ID: > On Oct 16, 2017, at 11:40 AM, Stefan Karlsson wrote: > > Hi all, > > Please review this patch to move the call of the static JvmtiExport::weak_oops_do out of the JNIHandleBlock::weak_oops_do member function into the new WeakProcessor. > > Today, this isn't causing any bugs because there's only one instance of JNIHandleBlock, the _weak_global_handles. However, in prototypes with more than one JNIHandleBlock, this results in multiple calls to JvmtiExport::weak_oops_do. > > http://cr.openjdk.java.net/~stefank/8189360/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8189360 > > This patch builds upon the patch in: > http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-October/028684.html > > Tested with JPRT. 
> > Thanks, > StefanK src/hotspot/share/runtime/jniHandles.cpp Maybe remove #include ?prims/jvmtiExport.hpp? ? Otherwise looks good. I don?t need another webrev for that #include removal. From stefan.karlsson at oracle.com Tue Oct 17 20:57:15 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 17 Oct 2017 22:57:15 +0200 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> Message-ID: Hi Kim, On 2017-10-17 21:52, Kim Barrett wrote: >> On Oct 16, 2017, at 10:14 AM, Stefan Karlsson wrote: >> >> Hi all, >> >> Please review this patch to move the JNI global weak handle processing out of the ReferenceProcessor into a new class, WeakProcessor, that will be used to gather processing and cleaning of "native weak" oops. >> >> After this patch the ReferenceProcessor will only deal with the Java level java.lang.ref weak references. >> >> http://cr.openjdk.java.net/~stefank/8189359/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8189359 >> >> Note this patch only moves the JNIHandles::weak_oops_do calls into the new WeakProcessor. A subsequent patch for JDK-8189359 will move the JvmtiExport::weak_oops_do from JNIHandleBlock into the WeakProcessor. >> >> Future patches like JDK-8171119, for example, will be able to add it's set of native weak oops into the new WeakProcessor functions and won't have to duplicate the code for all GCs or add call inside the ReferenceProcessor. >> >> Tested with JPRT. >> >> Thanks, >> StefanK > Mostly OK, and nice to have this cleaned up, esp. with the JDK-8189359 > followup. Thanks for reviewing! > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/weakProcessor.hpp > > I don't understand the name of unlink_or_oops_do. A little hint as to > the semantics of the two functions in WeakProcessor might help. Right > now, it's not at all obvious how they differ, other than by signature. I've renamed it to weak_oops_do and added comments to hopefully explain what they do. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1CollectedHeap.cpp > > This change seems to remove the only call to process_weak_jni_handles(). Removed. Here are the updated webrevs: ?http://cr.openjdk.java.net/~stefank/8189359/webrev.01.delta ?http://cr.openjdk.java.net/~stefank/8189359/webrev.01 Thanks, StefanK > > ------------------------------------------------------------------------------ > From stefan.karlsson at oracle.com Tue Oct 17 20:59:20 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 17 Oct 2017 22:59:20 +0200 Subject: 8189360: JvmtiExport::weak_oops_do is called for all JNIHandleBlock instances In-Reply-To: References: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> Message-ID: On 2017-10-17 21:55, Kim Barrett wrote: >> On Oct 16, 2017, at 11:40 AM, Stefan Karlsson wrote: >> >> Hi all, >> >> Please review this patch to move the call of the static JvmtiExport::weak_oops_do out of the JNIHandleBlock::weak_oops_do member function into the new WeakProcessor. >> >> Today, this isn't causing any bugs because there's only one instance of JNIHandleBlock, the _weak_global_handles. However, in prototypes with more than one JNIHandleBlock, this results in multiple calls to JvmtiExport::weak_oops_do. 
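
A minimal sketch of the call structure under review, assuming the shape of the code quoted later in this thread; the names are taken from the quoted snippets and webrev discussion, everything else is illustrative rather than the actual patch:

    // Illustrative only; the reviewed code is in the webrevs.
    // The point of JDK-8189360 is that JvmtiExport::weak_oops_do is a
    // process-wide pass, so it belongs beside JNIHandles::weak_oops_do in
    // WeakProcessor rather than inside JNIHandleBlock::weak_oops_do, where a
    // second JNIHandleBlock instance would make it run more than once.
    void WeakProcessor::weak_oops_do(BoolObjectClosure* is_alive,
                                     OopClosure* keep_alive) {
      JNIHandles::weak_oops_do(is_alive, keep_alive);   // JNI weak global handles
      JvmtiExport::weak_oops_do(is_alive, keep_alive);  // JVMTI weak oops, exactly once
    }
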
>> >> http://cr.openjdk.java.net/~stefank/8189360/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8189360 >> >> This patch builds upon the patch in: >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-October/028684.html >> >> Tested with JPRT. >> >> Thanks, >> StefanK > src/hotspot/share/runtime/jniHandles.cpp > Maybe remove #include ?prims/jvmtiExport.hpp? ? > > Otherwise looks good. I don?t need another webrev for that #include removal. Thanks! StefanK From coleen.phillimore at oracle.com Tue Oct 17 21:03:51 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 17 Oct 2017 17:03:51 -0400 Subject: RFR: 8184914: Use MacroAssembler::cmpoop() consistently when comparing heap objects In-Reply-To: <8d667010-f17e-7d1b-088b-106999e3b005@redhat.com> References: <8d667010-f17e-7d1b-088b-106999e3b005@redhat.com> Message-ID: <9b629556-b3f0-e52e-35e0-711c6a767e95@oracle.com> This looks reasonable to me.? Maybe the compiler group should review the c1 part.? I changed the mailing list to hotspot-dev. I can sponsor this for you. Thanks, Coleen On 10/17/17 4:22 PM, Roman Kennke wrote: > (Not sure if this is the correct list to ask.. if not, please let me > know and/or redirect me) > > Currently, cmpoop() is only declared for 32-bit x86, and only used in > 2 places in C1 to compare oops. In other places, oops are compared > using cmpptr(). It would be useful to distinguish normal pointer > comparisons from heap object comparisons, and use cmpoop() > consistently for heap object comparisons. This would remove clutter in > several places where we have #ifdef _LP64 around comparisons, and > would also allow to insert necessary barriers for GCs that need them > (e.g. Shenandoah) later. > > http://cr.openjdk.java.net/~rkennke/8184914/webrev.00/ > > > Tested by running hotspot_gc jtreg tests. > > Can I get a review please? > > Thanks, Roman > > From rkennke at redhat.com Tue Oct 17 21:05:29 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 17 Oct 2017 23:05:29 +0200 Subject: RFR: 8184914: Use MacroAssembler::cmpoop() consistently when comparing heap objects In-Reply-To: <9b629556-b3f0-e52e-35e0-711c6a767e95@oracle.com> References: <8d667010-f17e-7d1b-088b-106999e3b005@redhat.com> <9b629556-b3f0-e52e-35e0-711c6a767e95@oracle.com> Message-ID: <55bb0f72-df71-44bc-53a0-7d982ab1ca04@redhat.com> > > This looks reasonable to me.? Maybe the compiler group should review > the c1 part.? I changed the mailing list to hotspot-dev. > I can sponsor this for you. Thanks, thanks and thanks! ;-) Roman > Thanks, > Coleen > > On 10/17/17 4:22 PM, Roman Kennke wrote: >> (Not sure if this is the correct list to ask.. if not, please let me >> know and/or redirect me) >> >> Currently, cmpoop() is only declared for 32-bit x86, and only used in >> 2 places in C1 to compare oops. In other places, oops are compared >> using cmpptr(). It would be useful to distinguish normal pointer >> comparisons from heap object comparisons, and use cmpoop() >> consistently for heap object comparisons. This would remove clutter >> in several places where we have #ifdef _LP64 around comparisons, and >> would also allow to insert necessary barriers for GCs that need them >> (e.g. Shenandoah) later. >> >> http://cr.openjdk.java.net/~rkennke/8184914/webrev.00/ >> >> >> Tested by running hotspot_gc jtreg tests. >> >> Can I get a review please? 
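
A minimal sketch of the kind of wrapper being proposed, assuming the existing x86 MacroAssembler::cmpptr entry points; the exact overloads and the 32-bit oop-immediate handling are in the webrev and are not reproduced here:

    // Illustrative only: a dedicated entry point for heap-object comparisons
    // that currently just delegates to a pointer compare, but gives collectors
    // that need barriers (e.g. Shenandoah) one place to hook in later.
    void MacroAssembler::cmpoop(Register src1, Register src2) {
      cmpptr(src1, src2);
    }

    void MacroAssembler::cmpoop(Register src1, Address src2) {
      cmpptr(src1, src2);
    }
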
>> >> Thanks, Roman >> >> > From kim.barrett at oracle.com Tue Oct 17 21:11:07 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 17 Oct 2017 17:11:07 -0400 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> Message-ID: <5F447CCE-7412-43C3-A27E-F89B393B05D9@oracle.com> > On Oct 17, 2017, at 4:57 PM, Stefan Karlsson wrote: > Here are the updated webrevs: > http://cr.openjdk.java.net/~stefank/8189359/webrev.01.delta > http://cr.openjdk.java.net/~stefank/8189359/webrev.01 ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/weakProcessor.hpp 33 // New contains of weak oops added to this class will automatically Sorry, but that's garbled, and I'm not sure what is intended. Previous version had "sets" instead of "contains", which seemed okay to me. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/weakProcessor.hpp 45 // Visit all oop*s and apply the given clousre. s/clousre/closure/ ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/weakProcessor.hpp 41 // The complete closure is used as a post-processing step called 42 // after each container has been processed. I think a comma is needed between "step" and "called". But we were discussing in chat whether this closure is even needed. I think it isn't... ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/weakProcessor.hpp 37 // Visit all oop*s and either apply the keep_alive closure if the referenced 38 // object is considered alive by the is_alive closure, otherwise do some 39 // container specific cleanup of element holding the oop. Suggest s/either// s/, otherwise/. Otherwise/ ------------------------------------------------------------------------------ From stefan.karlsson at oracle.com Tue Oct 17 21:22:54 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 17 Oct 2017 23:22:54 +0200 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: <5F447CCE-7412-43C3-A27E-F89B393B05D9@oracle.com> References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> <5F447CCE-7412-43C3-A27E-F89B393B05D9@oracle.com> Message-ID: <73dcae02-f18e-83aa-25ae-087dd1d917ca@oracle.com> On 2017-10-17 23:11, Kim Barrett wrote: >> On Oct 17, 2017, at 4:57 PM, Stefan Karlsson wrote: >> Here are the updated webrevs: >> http://cr.openjdk.java.net/~stefank/8189359/webrev.01.delta >> http://cr.openjdk.java.net/~stefank/8189359/webrev.01 Obviously, this is getting too late for me to do these kinds of changes. But let me try one more time. :) > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/weakProcessor.hpp > 33 // New contains of weak oops added to this class will automatically > > Sorry, but that's garbled, and I'm not sure what is intended. > > Previous version had "sets" instead of "contains", which seemed okay > to me. I used the word container in the comment for weak_oops_do, so I wanted to use the same word here. I can change to set(s) if that makes more sense. > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/weakProcessor.hpp > 45 // Visit all oop*s and apply the given clousre. > > s/clousre/closure/ Done. 
> > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/weakProcessor.hpp > 41 // The complete closure is used as a post-processing step called > 42 // after each container has been processed. > > I think a comma is needed between "step" and "called". Done. > But we were > discussing in chat whether this closure is even needed. I think it > isn't... I agree. But I'd rather think about that a bit more and then remove that as a separate patch. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/shared/weakProcessor.hpp > 37 // Visit all oop*s and either apply the keep_alive closure if the referenced > 38 // object is considered alive by the is_alive closure, otherwise do some > 39 // container specific cleanup of element holding the oop. > > Suggest > > s/either// > s/, otherwise/. Otherwise/ > > ------------------------------------------------------------------------------ Done. http://cr.openjdk.java.net/~stefank/8189359/webrev.02.delta http://cr.openjdk.java.net/~stefank/8189359/webrev.02 Thanks, StefanK > From per.liden at oracle.com Tue Oct 17 21:38:05 2017 From: per.liden at oracle.com (Per Liden) Date: Tue, 17 Oct 2017 23:38:05 +0200 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> Message-ID: Hi, On 2017-10-17 22:57, Stefan Karlsson wrote: [...] > > Here are the updated webrevs: > ?http://cr.openjdk.java.net/~stefank/8189359/webrev.01.delta > ?http://cr.openjdk.java.net/~stefank/8189359/webrev.01 Looks good. Just two comments. share/gc/parallel/psScavenge.cpp: 446 { 447 GCTraceTime(Debug, gc, phases) tm("Weak Processing", &_gc_timer); 448 WeakProcessor::weak_oops_do(&_is_alive_closure, &root_closure); 449 } I see you've kept the "complete" closure in WeakProcessor::weak_oops_do(), which is fine and we can clean that out later, but here you don't seem to mimic exactly what the old code did. I think you want to pass in &evac_followers here, right? share/gc/serial/defNewGeneration.cpp: 662 WeakProcessor::weak_oops_do(&is_alive, &keep_alive); Same here, pass in &evacuate_followers? I don't need to see a new webrev. cheers, Per From per.liden at oracle.com Tue Oct 17 21:43:59 2017 From: per.liden at oracle.com (Per Liden) Date: Tue, 17 Oct 2017 23:43:59 +0200 Subject: 8189360: JvmtiExport::weak_oops_do is called for all JNIHandleBlock instances In-Reply-To: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> References: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> Message-ID: <67b8baf1-0e2b-7ebc-2826-de81da5cf770@oracle.com> Hi, On 2017-10-16 17:40, Stefan Karlsson wrote: > Hi all, > > Please review this patch to move the call of the static > JvmtiExport::weak_oops_do out of the JNIHandleBlock::weak_oops_do member > function into the new WeakProcessor. > > Today, this isn't causing any bugs because there's only one instance of > JNIHandleBlock, the _weak_global_handles. However, in prototypes with > more than one JNIHandleBlock, this results in multiple calls to > JvmtiExport::weak_oops_do. 
> > http://cr.openjdk.java.net/~stefank/8189360/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8189360 30 void WeakProcessor::unlink_or_oops_do(BoolObjectClosure* is_alive, OopClosure* keep_alive, VoidClosure* complete) { 31 JNIHandles::weak_oops_do(is_alive, keep_alive); 32 if (complete != NULL) { 33 complete->do_void(); 34 } 35 36 JvmtiExport::weak_oops_do(is_alive, keep_alive); 37 if (complete != NULL) { 38 complete->do_void(); 39 } 40 } Should you really be calling complete->do_void() twice here. It seems to me that doing it once, after both calls to weak_oops_do() would mimic what the old code did? cheers, Per > > This patch builds upon the patch in: > http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-October/028684.html > > Tested with JPRT. > > Thanks, > StefanK From kim.barrett at oracle.com Tue Oct 17 23:04:19 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 17 Oct 2017 19:04:19 -0400 Subject: RFR(XXS): 8187462: IntegralConstant should not be AllStatic Message-ID: <7B2A73A3-3D83-4D29-A6D0-42C158575E28@oracle.com> Please review this small change to the IntegralConstant class so that it actually behaves as documented. CR: https://bugs.openjdk.java.net/browse/JDK-8187462 Webrev: http://cr.openjdk.java.net/~kbarrett/8187462/open.00/ Testing: Built on all platforms supported by JPRT. From serguei.spitsyn at oracle.com Tue Oct 17 23:08:51 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 17 Oct 2017 16:08:51 -0700 Subject: 8189360: JvmtiExport::weak_oops_do is called for all JNIHandleBlock instances In-Reply-To: References: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> Message-ID: Hi Stefan, Looks good. +1 for the removal of #include ?prims/jvmtiExport.hpp?. Thanks, Serguei On 10/17/17 12:55, Kim Barrett wrote: >> On Oct 16, 2017, at 11:40 AM, Stefan Karlsson wrote: >> >> Hi all, >> >> Please review this patch to move the call of the static JvmtiExport::weak_oops_do out of the JNIHandleBlock::weak_oops_do member function into the new WeakProcessor. >> >> Today, this isn't causing any bugs because there's only one instance of JNIHandleBlock, the _weak_global_handles. However, in prototypes with more than one JNIHandleBlock, this results in multiple calls to JvmtiExport::weak_oops_do. >> >> http://cr.openjdk.java.net/~stefank/8189360/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8189360 >> >> This patch builds upon the patch in: >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-October/028684.html >> >> Tested with JPRT. >> >> Thanks, >> StefanK > src/hotspot/share/runtime/jniHandles.cpp > Maybe remove #include ?prims/jvmtiExport.hpp? ? > > Otherwise looks good. I don?t need another webrev for that #include removal. > From coleen.phillimore at oracle.com Tue Oct 17 23:09:04 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 17 Oct 2017 19:09:04 -0400 Subject: RFR(XXS): 8187462: IntegralConstant should not be AllStatic In-Reply-To: <7B2A73A3-3D83-4D29-A6D0-42C158575E28@oracle.com> References: <7B2A73A3-3D83-4D29-A6D0-42C158575E28@oracle.com> Message-ID: <4574a14e-5375-ab81-ac86-b13393f33f70@oracle.com> This looks good.? I'm pretty sure this can be checked in under the "trivial" rule. Coleen On 10/17/17 7:04 PM, Kim Barrett wrote: > Please review this small change to the IntegralConstant class so that > it actually behaves as documented. 
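
For readers unfamiliar with the class, "behaves as documented" here means being usable like std::integral_constant, whose instances are ordinary values, while an AllStatic base is meant for classes that are never instantiated. A sketch modeled on std::integral_constant, not the actual HotSpot declaration:

    // Illustrative only; see the webrev for the real change.
    template<typename T, T v>
    struct IntegralConstant {              // no AllStatic base, so instances are allowed
      typedef T value_type;
      typedef IntegralConstant<T, v> type;
      static const value_type value = v;
      operator value_type() const { return value; }  // an instance converts to its value
    };

    typedef IntegralConstant<bool, true>  TrueType;   // usable as tag-dispatch arguments
    typedef IntegralConstant<bool, false> FalseType;
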
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8187462 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8187462/open.00/ > > Testing: > Built on all platforms supported by JPRT. > From kim.barrett at oracle.com Tue Oct 17 23:57:53 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 17 Oct 2017 19:57:53 -0400 Subject: RFR(XXS): 8187462: IntegralConstant should not be AllStatic In-Reply-To: <4574a14e-5375-ab81-ac86-b13393f33f70@oracle.com> References: <7B2A73A3-3D83-4D29-A6D0-42C158575E28@oracle.com> <4574a14e-5375-ab81-ac86-b13393f33f70@oracle.com> Message-ID: > On Oct 17, 2017, at 7:09 PM, coleen.phillimore at oracle.com wrote: > > This looks good. I'm pretty sure this can be checked in under the "trivial" rule. > Coleen Thanks. Agree that it?s trivial. > > On 10/17/17 7:04 PM, Kim Barrett wrote: >> Please review this small change to the IntegralConstant class so that >> it actually behaves as documented. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8187462 >> >> Webrev: >> http://cr.openjdk.java.net/~kbarrett/8187462/open.00/ >> >> Testing: >> Built on all platforms supported by JPRT. From kim.barrett at oracle.com Wed Oct 18 00:09:30 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 17 Oct 2017 20:09:30 -0400 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> Message-ID: <8209F13B-72CA-4135-B589-09D72A0B54AA@oracle.com> > On Oct 17, 2017, at 5:38 PM, Per Liden wrote: > > Hi, > > On 2017-10-17 22:57, Stefan Karlsson wrote: > [...] >> Here are the updated webrevs: >> http://cr.openjdk.java.net/~stefank/8189359/webrev.01.delta >> http://cr.openjdk.java.net/~stefank/8189359/webrev.01 > > Looks good. Just two comments. > > share/gc/parallel/psScavenge.cpp: > > 446 { > 447 GCTraceTime(Debug, gc, phases) tm("Weak Processing", &_gc_timer); > 448 WeakProcessor::weak_oops_do(&_is_alive_closure, &root_closure); > 449 } > > I see you've kept the "complete" closure in WeakProcessor::weak_oops_do(), which is fine and we can clean that out later, but here you don't seem to mimic exactly what the old code did. I think you want to pass in &evac_followers here, right? > > share/gc/serial/defNewGeneration.cpp: > > 662 WeakProcessor::weak_oops_do(&is_alive, &keep_alive); > > Same here, pass in &evacuate_followers? > > I don't need to see a new webrev. > > cheers, > Per Oh, I missed that. Same thing in cms/parNewGeneration.cpp, I think. Otherwise, looks good. I don?t need a new webrev either. From thomas.stuefe at gmail.com Wed Oct 18 07:10:51 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 18 Oct 2017 09:10:51 +0200 Subject: RFR(xxs): 8187230: [aix] Leave OS guard page size at default for non-java threads instead of explicitly setting it In-Reply-To: References: <368f252c8d5440e785e1ee341f4a918e@sap.com> Message-ID: Hi all, I am cleaning up my backlog of old issues which did not make it into the repo before the consolidation. Bug: https://bugs.openjdk.java.net/browse/JDK-8187230 Last Webrev (just rebased to the new repo structure, no changes): http://cr.openjdk.java.net/~stuefe/webrevs/8187230-aix-leave-os-guard-page-size-at-default-for-non-java-threads/webrev.02/webrev/ For your convenience, here the original message: <<< The change is very subtle. Before, we would set the OS guard page size for every thread - for java threads disable them, for non-java threads we'd set them to 4K. 
Now, we still disable them for java threads but leave them at the OS default size for non-java threads. The really important part is the disabling of OS guard pages for java threads, where we have a VM guard pages in place and do not want to spend more memory on OS guards. We do not really care for the exact size of the OS guard pages for non-java threads, and therefore should not set it - we should leave the size in place the OS deems sufficient. That also spares us the complexity of handling the thread stack page size, which on AIX may be different from os::vm_page_size(). >>> @Chris: you did ask whether this would make sense for Linux too. I think you are right, but as Goetz pointed out matters are more complicated as glibc pthread_create does not substract OS guard size from the user specified stack size, so it requires us to know the OS guard size and add it to the specified stack size (funny, the same issue we have with VM guards and -Xss). So, for now, I'd prefer this to keep AIX only. I think I need a second reviewer beside Goetz. Thanks! Thomas On Fri, Sep 8, 2017 at 10:48 AM, Thomas St?fe wrote: > Hi Guys, > > On Fri, Sep 8, 2017 at 9:51 AM, Lindenmaier, Goetz < > goetz.lindenmaier at sap.com> wrote: > >> Hi Chris, >> >> on linux the pthread implementation is a bit strange, or buggy. >> It takes the OS guard pages out of the stack size specified. >> We need to set it so we can predict the additional space >> that must be allocated for the stack. >> >> See also the comment in os_linux.cpp, create_thread(). >> > > Goetz, I know we talked about this off list yesterday, but now I am not > sure this is actually needed. Yes, to correctly calculate the stack size, > we need to know the OS guard page size, but we do not need to set it, we > just need to know it. So, for non-java threads (java threads get the OS > guard set to zero), it would probably be sufficient to: > > - pthread_attr_init() (sets default thread attribute values to the > attribute structure) and then > - pthread_attr_getguardsize() to read the guard size from that structure. > > That way we leave the OS guard page at the size glibc deems best. I think > that is a better option. Consider a situation where the glibc changes the > size of the OS guard pages, for whatever reason - we probably should follow > suit. > > See e.g. this security issue - admittedly only loosely related, since the > fix for this issue seemed to be a fix to stack banging, not changing the OS > guard size: https://access.redhat.com/security/vulnerabilities/stackguard > > So, in short, I think we could change this for Linux too. If you guys > agree, I'll add this to the patch. Since I am on vacation and the depot is > closed, it may take some time. > > Kind Regards, Thomas > > > > > >> >> Best regards, >> Goetz. >> >> > -----Original Message----- >> > From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounc >> es at openjdk.java.net] >> > On Behalf Of Chris Plummer >> > Sent: Thursday, September 07, 2017 11:07 PM >> > To: Thomas St?fe ; ppc-aix-port- >> > dev at openjdk.java.net >> > Cc: HotSpot Open Source Developers >> > Subject: Re: RFR(xxs): 8187230: [aix] Leave OS guard page size at >> default for >> > non-java threads instead of explicitly setting it >> > >> > Hi Thomas, >> > >> > Is there a reason this shouldn't also be done for linux? 
>> > >> > thanks, >> > >> > Chris >> > >> > On 9/7/17 3:02 AM, Thomas St?fe wrote: >> > > Hi all, >> > > >> > > may I please have a review for this small change: >> > > >> > > Bug: >> > > https://bugs.openjdk.java.net/browse/JDK-8187230 >> > > >> > > Webrev: >> > > http://cr.openjdk.java.net/~stuefe/webrevs/8187230-aix- >> > > leave-os-guard-page-size-at-default-for-non-java- >> > threads/webrev.00/webrev/ >> > > >> > > The change is very subtle. >> > > >> > > Before, we would set the OS guard page size for every thread - for >> java >> > > threads disable them, for non-java threads we'd set them to 4K. >> > > >> > > Now, we still disable them for java threads but leave them at the OS >> > > default size for non-java threads. >> > > >> > > The really important part is the disabling of OS guard pages for java >> > > threads, where we have a VM guard pages in place and do not want to >> > spend >> > > more memory on OS guards. We do not really care for the exact size of >> the >> > > OS guard pages for non-java threads, and therefore should not set it >> - we >> > > should leave the size in place the OS deems sufficient. That also >> spares us >> > > the complexity of handling the thread stack page size, which on AIX >> may be >> > > different from os::vm_page_size(). >> > > >> > > Thank you and Kind Regards, Thomas >> > >> > >> >> > From david.holmes at oracle.com Wed Oct 18 07:12:50 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 17:12:50 +1000 Subject: RFR(xxs): 8187230: [aix] Leave OS guard page size at default for non-java threads instead of explicitly setting it In-Reply-To: References: <368f252c8d5440e785e1ee341f4a918e@sap.com> Message-ID: Looks fine to me. Cheers, David On 18/10/2017 5:10 PM, Thomas St?fe wrote: > Hi all, > > I am cleaning up my backlog of old issues which did not make it into the > repo before the consolidation. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8187230 > > > Last Webrev (just rebased to the new repo structure, no changes): > http://cr.openjdk.java.net/~stuefe/webrevs/8187230-aix-leave-os-guard-page-size-at-default-for-non-java-threads/webrev.02/webrev/ > > For your convenience, here the original message: > > <<< > The change is very subtle. > > Before, we would set the OS guard page size for every thread - for java > threads disable them, for non-java threads we'd set them to 4K. > > Now, we still disable them for java threads but leave them at the OS > default size for non-java threads. > > The really important part is the disabling of OS guard pages for java > threads, where we have a VM guard pages in place and do not want to > spend more memory on OS guards. We do not really care for the exact size > of the OS guard pages for non-java threads, and therefore should not set > it - we should leave the size in place the OS deems sufficient. That > also spares us the complexity of handling the thread stack page size, > which on AIX may be different from os::vm_page_size(). > >>> > > @Chris: you did ask whether this would make sense for Linux too. I think > you are right, but as Goetz pointed out matters are more complicated as > glibc pthread_create does not substract OS guard size from the user > specified stack size, so it requires us to know the OS guard size and > add it to the specified stack size (funny, the same issue we have with > VM guards and -Xss). So, for now, I'd prefer this to keep AIX only. > > I think I need a second reviewer beside Goetz. > > Thanks! 
> > Thomas > > > > On Fri, Sep 8, 2017 at 10:48 AM, Thomas St?fe > wrote: > > Hi Guys, > > On Fri, Sep 8, 2017 at 9:51 AM, Lindenmaier, Goetz > > wrote: > > Hi Chris, > > on linux the pthread implementation is a bit strange, or buggy. > It takes the OS guard pages out of the stack size specified. > We need to set it so we can predict the additional space > that must be allocated for the stack. > > See also the comment in os_linux.cpp, create_thread(). > > > Goetz, I know we talked about this off list yesterday, but now I am > not sure this is actually needed. Yes, to correctly calculate the > stack size, we need to know the OS guard page size, but we do not > need to set it, we just need to know it. So, for non-java threads > (java threads get the OS guard set to zero), it would probably be > sufficient to: > > - pthread_attr_init() (sets default thread attribute values to the > attribute structure) and then > - pthread_attr_getguardsize() to read the guard size from that > structure. > > That way we leave the OS guard page at the size glibc deems best. I > think that is a better option. Consider a situation where the glibc > changes the size of the OS guard pages, for whatever reason - we > probably should follow suit. > > See e.g. this security issue - admittedly only loosely related, > since the fix for this issue seemed to be a fix to stack banging, > not changing the OS guard size: > https://access.redhat.com/security/vulnerabilities/stackguard > > > So, in short, I think we could change this for Linux too. If you > guys agree, I'll add this to the patch. Since I am on vacation and > the depot is closed, it may take some time. > > Kind Regards, Thomas > > > > > Best regards, > ? Goetz. > > > -----Original Message----- > > From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net > ] > > On Behalf Of Chris Plummer > > Sent: Thursday, September 07, 2017 11:07 PM > > To: Thomas St?fe >; > ppc-aix-port- > > dev at openjdk.java.net > > Cc: HotSpot Open Source Developers > > > Subject: Re: RFR(xxs): 8187230: [aix] Leave OS guard page size at default for > > non-java threads instead of explicitly setting it > > > > Hi Thomas, > > > > Is there a reason this shouldn't also be done for linux? > > > > thanks, > > > > Chris > > > > On 9/7/17 3:02 AM, Thomas St?fe wrote: > > > Hi all, > > > > > > may I please have a review for this small change: > > > > > > Bug: > > > https://bugs.openjdk.java.net/browse/JDK-8187230 > > > > > > > Webrev: > > > http://cr.openjdk.java.net/~stuefe/webrevs/8187230-aix- > > > > leave-os-guard-page-size-at-default-for-non-java- > > threads/webrev.00/webrev/ > > > > > > The change is very subtle. > > > > > > Before, we would set the OS guard page size for every > thread - for java > > > threads disable them, for non-java threads we'd set them to 4K. > > > > > > Now, we still disable them for java threads but leave them > at the OS > > > default size for non-java threads. > > > > > > The really important part is the disabling of OS guard > pages for java > > > threads, where we have a VM guard pages in place and do not > want to > > spend > > > more memory on OS guards. We do not really care for the > exact size of the > > > OS guard pages for non-java threads, and therefore should > not set it - we > > > should leave the size in place the OS deems sufficient. > That also spares us > > > the complexity of handling the thread stack page size, > which on AIX may be > > > different from os::vm_page_size(). 
> > > > > > Thank you and Kind Regards, Thomas > > > > > > > From thomas.stuefe at gmail.com Wed Oct 18 07:27:24 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 18 Oct 2017 09:27:24 +0200 Subject: RFR(xxs): 8187230: [aix] Leave OS guard page size at default for non-java threads instead of explicitly setting it In-Reply-To: References: <368f252c8d5440e785e1ee341f4a918e@sap.com> Message-ID: On Wed, Oct 18, 2017 at 9:12 AM, David Holmes wrote: > Looks fine to me. > > Cheers, > David > > Thanks David! > On 18/10/2017 5:10 PM, Thomas St?fe wrote: > >> Hi all, >> >> I am cleaning up my backlog of old issues which did not make it into the >> repo before the consolidation. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8187230 < >> https://bugs.openjdk.java.net/browse/JDK-8187230> >> >> Last Webrev (just rebased to the new repo structure, no changes): >> http://cr.openjdk.java.net/~stuefe/webrevs/8187230-aix-leave >> -os-guard-page-size-at-default-for-non-java-threads/webrev.02/webrev/ >> >> For your convenience, here the original message: >> >> <<< >> The change is very subtle. >> >> Before, we would set the OS guard page size for every thread - for java >> threads disable them, for non-java threads we'd set them to 4K. >> >> Now, we still disable them for java threads but leave them at the OS >> default size for non-java threads. >> >> The really important part is the disabling of OS guard pages for java >> threads, where we have a VM guard pages in place and do not want to spend >> more memory on OS guards. We do not really care for the exact size of the >> OS guard pages for non-java threads, and therefore should not set it - we >> should leave the size in place the OS deems sufficient. That also spares us >> the complexity of handling the thread stack page size, which on AIX may be >> different from os::vm_page_size(). >> >>> >> >> @Chris: you did ask whether this would make sense for Linux too. I think >> you are right, but as Goetz pointed out matters are more complicated as >> glibc pthread_create does not substract OS guard size from the user >> specified stack size, so it requires us to know the OS guard size and add >> it to the specified stack size (funny, the same issue we have with VM >> guards and -Xss). So, for now, I'd prefer this to keep AIX only. >> >> I think I need a second reviewer beside Goetz. >> >> Thanks! >> >> Thomas >> >> >> >> On Fri, Sep 8, 2017 at 10:48 AM, Thomas St?fe > > wrote: >> >> Hi Guys, >> >> On Fri, Sep 8, 2017 at 9:51 AM, Lindenmaier, Goetz >> > wrote: >> >> Hi Chris, >> >> on linux the pthread implementation is a bit strange, or buggy. >> It takes the OS guard pages out of the stack size specified. >> We need to set it so we can predict the additional space >> that must be allocated for the stack. >> >> See also the comment in os_linux.cpp, create_thread(). >> >> >> Goetz, I know we talked about this off list yesterday, but now I am >> not sure this is actually needed. Yes, to correctly calculate the >> stack size, we need to know the OS guard page size, but we do not >> need to set it, we just need to know it. So, for non-java threads >> (java threads get the OS guard set to zero), it would probably be >> sufficient to: >> >> - pthread_attr_init() (sets default thread attribute values to the >> attribute structure) and then >> - pthread_attr_getguardsize() to read the guard size from that >> structure. >> >> That way we leave the OS guard page at the size glibc deems best. I >> think that is a better option. 
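(To make that concrete, a rough sketch of the two calls, with error handling omitted; the helper names and the is_java_thread parameter are made up for illustration, this is not the actual os_linux.cpp/os_aix.cpp code:)

    #include <pthread.h>
    #include <stddef.h>

    // Read the guard size the libc would use by default - query it, don't set it.
    static size_t default_guard_size() {
      pthread_attr_t attr;
      size_t guard = 0;
      pthread_attr_init(&attr);                  // fills in the implementation defaults
      pthread_attr_getguardsize(&attr, &guard);  // just read the default back
      pthread_attr_destroy(&attr);
      return guard;
    }

    // Only Java threads, which already carry VM guard pages, zero the OS guard;
    // non-Java threads keep whatever default the attribute object already holds.
    static void setup_os_guard(pthread_attr_t* attr, bool is_java_thread) {
      if (is_java_thread) {
        pthread_attr_setguardsize(attr, 0);
      }
    }

Knowing the default is also all that is needed for the stack-size bookkeeping mentioned earlier, without hard-coding a page size.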
Consider a situation where the glibc >> changes the size of the OS guard pages, for whatever reason - we >> probably should follow suit. >> >> See e.g. this security issue - admittedly only loosely related, >> since the fix for this issue seemed to be a fix to stack banging, >> not changing the OS guard size: >> https://access.redhat.com/security/vulnerabilities/stackguard >> >> >> So, in short, I think we could change this for Linux too. If you >> guys agree, I'll add this to the patch. Since I am on vacation and >> the depot is closed, it may take some time. >> >> Kind Regards, Thomas >> >> >> >> >> Best regards, >> Goetz. >> >> > -----Original Message----- >> > From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounc >> es at openjdk.java.net >> ] >> > On Behalf Of Chris Plummer >> > Sent: Thursday, September 07, 2017 11:07 PM >> > To: Thomas St?fe > thomas.stuefe at gmail.com>>; >> ppc-aix-port- >> > dev at openjdk.java.net >> > Cc: HotSpot Open Source Developers < >> hotspot-dev at openjdk.java.net > >> > Subject: Re: RFR(xxs): 8187230: [aix] Leave OS guard page size >> at default for >> > non-java threads instead of explicitly setting it >> > >> > Hi Thomas, >> > >> > Is there a reason this shouldn't also be done for linux? >> > >> > thanks, >> > >> > Chris >> > >> > On 9/7/17 3:02 AM, Thomas St?fe wrote: >> > > Hi all, >> > > >> > > may I please have a review for this small change: >> > > >> > > Bug: >> > > https://bugs.openjdk.java.net/browse/JDK-8187230 >> >> > > >> > > Webrev: >> > > http://cr.openjdk.java.net/~stuefe/webrevs/8187230-aix- >> >> > > leave-os-guard-page-size-at-default-for-non-java- >> > threads/webrev.00/webrev/ >> > > >> > > The change is very subtle. >> > > >> > > Before, we would set the OS guard page size for every >> thread - for java >> > > threads disable them, for non-java threads we'd set them to >> 4K. >> > > >> > > Now, we still disable them for java threads but leave them >> at the OS >> > > default size for non-java threads. >> > > >> > > The really important part is the disabling of OS guard >> pages for java >> > > threads, where we have a VM guard pages in place and do not >> want to >> > spend >> > > more memory on OS guards. We do not really care for the >> exact size of the >> > > OS guard pages for non-java threads, and therefore should >> not set it - we >> > > should leave the size in place the OS deems sufficient. >> That also spares us >> > > the complexity of handling the thread stack page size, >> which on AIX may be >> > > different from os::vm_page_size(). >> > > >> > > Thank you and Kind Regards, Thomas >> > >> > >> >> >> >> From magnus.ihse.bursie at oracle.com Wed Oct 18 08:04:11 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Wed, 18 Oct 2017 10:04:11 +0200 Subject: RFR: JDK-8189607 Remove duplicated jvmticmlr.h Message-ID: The file jvmticmlr.h is stored twice in the repo, both in hotspot and in java.base. They are both identical, and only the java.base version is included in the final product. This might arguably have been useful in a pre-consolidated world, but makes absolutely no sense now. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8189607 WebRev: http://cr.openjdk.java.net/~ihse/JDK-8189607-remove-duplicated-jvmticmlr/webrev.01 /Magnus From thomas.schatzl at oracle.com Wed Oct 18 08:18:37 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 18 Oct 2017 10:18:37 +0200 Subject: RFR(M) 8186834:Expanding old area without full GC in parallel GC In-Reply-To: References: Message-ID: <1508314717.4435.5.camel@oracle.com> On Tue, 2017-10-17 at 21:09 +0900, Michihiro Horie wrote: > Hi Thomas, > > Thanks a lot for your response! > > >what is the difference (in performance) to simply set -Xms==-Xmx > here? > This change assumes -Xms==-Xmx is not set.? > > Please let me explain our situation. We have a real project where we > need to run multiple Java processes per node with limited memory > resource for job schedulers of parallel distributed computing > framework such as Spark. Arbitrary Java processes actually need the > Xmx heap, although the same JVM arguments are uniformly set for these > job schedulers. I am still trying to understand why in this situation the new (additional) flag would be preferable to the mentioned alternative. Maybe there is something about argument passing, but the description seems to be a bit unclear. Let me recap if I understood the problem and the need for this solution correctly: - there are at least two different kinds of VMs, job schedulers and the big data processing worker VMs - (assumption) the job schedulers and the worker VMs have different memory requirements - to ease VM management (assumption), both job schedulers and the worker VMs need to be passed the same VM arguments? So in your case you would add the new -XX:+UseAdaptiveGenerationSizePolicyBeforeMajorCollection to both, and the worker VM would benefit from it, while the job scheduler would never ever expand the heap anyway? Otherwise, if you were able to pass different VM arguments to the different VMs, the use of -Xms (instead of that new flag) would seem straightforward to me (Only specifying -Xms will not actually commit the memory, so there is no difference in actual memory use). Particularly if, as you mention, full gc will not yield a significant amount of freed memory, both methods seem to achieve the exact same effect. Or is there another difference between passing -Xms instead of -XX:UseAdaptiveGenerationSizePolicyBeforeMajorCollection? > Besides, only a limited number of objects are > collected in the full GCs that occur during the heap expansion. So, > full GC here is especially expensive. Did you ever try G1 for these workloads? There are some (old) reports [0] where G1 outperforms Parallel GC with some tuning. It generally does not use full gcs to expand the heap. With recent improvements in JDK9, it should perform even slightly better, but I am not sure if Spark already works with JDK9. > >And why not make the (first) full gc expand the heap more > > aggressively? > >(I think there is at least one way to do that, something like > >Min/MaxFreeHeapRatio or so, I can look it up if needed). > Thank you for telling the Min/MaxHeapFreeRatio. I think they surely > help for our purpose, but I think this change would be still > effective with them. 
> > Best regards, Thanks, Thomas [0] https://databricks.com/blog/2015/05/28/tuning-java-garbage-collecti on-for-spark-applications.html > -- > Michihiro, > IBM Research - Tokyo > > Thomas Schatzl ---2017/10/13 22:04:38---Hi, On Tue, 2017-08-29 at > 00:20 +0900, Michihiro Horie wrote: > > From: Thomas Schatzl > To: Michihiro Horie , hotspot-dev at openjdk.java.net > Cc: Hiroshi H Horii > Date: 2017/10/13 22:04 > Subject: Re: RFR(M) 8186834:Expanding old area without full GC in > parallel GC > > > > Hi, > > On Tue, 2017-08-29 at 00:20 +0900, Michihiro Horie wrote: > > Dear all, > >? > > Would you please review the following change? > > bug: https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.open > jdk.java.net_browse_JDK-2D8186834&d=DwIFaQ&c=jf_iaSHvJObTbx- > siA1ZOg&r=oecsIpYF-cifqq2i1JEH0Q&m=CaV8n9mhlYuwwkSthJ3tAKsxYWXA8YW- > A_scv5JwjxE&s=RN7_XLvlvAligv4Bmsj1fMFsKTHsrQQFEaLRIrjYm9Y&e= > > webrev: https://urldefense.proofpoint.com/v2/url?u=http-3A__cr.open > jdk.java.net_-7Emhorie_8186834_webrev.00_&d=DwIFaQ&c=jf_iaSHvJObTbx- > siA1ZOg&r=oecsIpYF-cifqq2i1JEH0Q&m=CaV8n9mhlYuwwkSthJ3tAKsxYWXA8YW- > A_scv5JwjxE&s=Lkjbx2hQv0H19iIiNH-7wwN0HKn5xxhXinMHhoPIvqI&e= > >? > > In parallel GC, old area is expanded only after a full GC occurs. > > I am wondering if we could give an option to expand old area > without > > full GC. So, I added an option > > UseAdaptiveGenerationSizePolicyBeforeMajorCollection > > Sorry for the late (and probably stupid) question, but what is the > difference (in performance) to simply set -Xms==-Xmx here? > > And why not make the (first) full gc expand the heap more > aggressively? > (I think there is at least one way to do that, something like > Min/MaxFreeHeapRatio or so, I can look it up if needed). > > Thanks, > ?Thomas > > > Following is a simple micro benchmark I used to see the benefit of > > this change. > > As a result, pause time of full GC reduced by 30%. Full GC count > > reduced by 54%. > > Elapsed time reduced by 7%. > >? > > import java.util.HashMap; > > import java.util.Map; > > public class HeapExpandTest { > > ? static Map map = new HashMap<>(); > > ? public static void main(String[] args) throws Exception { > > ????long start = System.currentTimeMillis(); > > ????for (int i = 0; i < 2200; ++i) { > > ??????map.put(i, new byte[1024*1024]); // 1MB > > ????} > > ????System.out.println("elapsed= " + (System.currentTimeMillis() - > > start)); > > ? } > > } > >? > > JVM options: -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy > > -XX:ParallelGCThreads=8 -Xms64m -Xmx3g > > -XX:+UseAdaptiveGenerationSizePolicyBeforeMajorCollection > > > From erik.joelsson at oracle.com Wed Oct 18 08:26:06 2017 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Wed, 18 Oct 2017 10:26:06 +0200 Subject: RFR: JDK-8189607 Remove duplicated jvmticmlr.h In-Reply-To: References: Message-ID: On 2017-10-18 10:04, Magnus Ihse Bursie wrote: > The file jvmticmlr.h is stored twice in the repo, both in hotspot and > in java.base. They are both identical, and only the java.base version > is included in the final product. This might arguably have been useful > in a pre-consolidated world, but makes absolutely no sense now. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8189607 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8189607-remove-duplicated-jvmticmlr/webrev.01 > The question is, which file location makes the most sense. I think your pick of java.base/share/native/include probably makes more sense as that makes it much clearer that this is an exported header file. 
Looks good to me. /Erik From magnus.ihse.bursie at oracle.com Wed Oct 18 08:37:18 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Wed, 18 Oct 2017 10:37:18 +0200 Subject: RFR: JDK-8189607 Remove duplicated jvmticmlr.h In-Reply-To: References: Message-ID: On 2017-10-18 10:26, Erik Joelsson wrote: > On 2017-10-18 10:04, Magnus Ihse Bursie wrote: >> The file jvmticmlr.h is stored twice in the repo, both in hotspot and >> in java.base. They are both identical, and only the java.base version >> is included in the final product. This might arguably have been >> useful in a pre-consolidated world, but makes absolutely no sense now. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8189607 >> WebRev: >> http://cr.openjdk.java.net/~ihse/JDK-8189607-remove-duplicated-jvmticmlr/webrev.01 >> > The question is, which file location makes the most sense. I think > your pick of java.base/share/native/include probably makes more sense > as that makes it much clearer that this is an exported header file. Yes, that was my reasoning. Also, the file is not really tied to hotspot per se -- if you were to plug in another VM, you'd still need this file. Combined with the fact that this was the file that was exported to the world. (Which doesn't *really* make any difference in this case, since the files were identical...) > Looks good to me. Thanks. /Magnus > > /Erik > From robbin.ehn at oracle.com Wed Oct 18 08:51:33 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 18 Oct 2017 10:51:33 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <59E62216.5070401@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <59E62216.5070401@oracle.com> Message-ID: Thanks Erik, On 2017-10-17 17:30, Erik ?sterlund wrote: > Hi Robbin, > > Looks fantastic. We have to credit Mikael Gerdin for much of the work. Since you have been involved also, I count you as one of the contributors, and view your review as a bit biased but really appreciated of course :) /Robbin > > Thanks, > /Erik > > On 2017-10-11 15:37, Robbin Ehn wrote: >> Hi all, >> >> Starting the review of the code while JEP work is still not completed. >> >> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >> >> This JEP introduces a way to execute a callback on threads without performing >> a global VM safepoint. It makes it both possible and cheap to stop individual >> threads and not just all threads or none. >> >> Entire changeset: >> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >> >> Divided into 3-parts, >> SafepointMechanism abstraction: >> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >> Consolidating polling page allocation: >> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >> Handshakes: >> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >> >> A handshake operation is a callback that is executed for each JavaThread while >> that thread is in a safepoint safe state. The callback is executed either by >> the thread itself or by the VM thread while keeping the thread in a blocked >> state. The big difference between safepointing and handshaking is that the per >> thread operation will be performed on all threads as soon as possible and they >> will continue to execute as soon as it?s own operation is completed. If a >> JavaThread is known to be running, then a handshake can be performed with that >> single JavaThread as well. 
>> >> The current safepointing scheme is modified to perform an indirection through >> a per-thread pointer which will allow a single thread's execution to be forced >> to trap on the guard page. In order to force a thread to yield the VM updates >> the per-thread pointer for the corresponding thread to point to the guarded page. >> >> Example of potential use-cases: >> -Biased lock revocation >> -External requests for stack traces >> -Deoptimization >> -Async exception delivery >> -External suspension >> -Eliding memory barriers >> >> All of these will benefit the VM moving towards becoming more low-latency >> friendly by reducing the number of global safepoints. >> Platforms that do not yet implement the per JavaThread poll, a fallback to >> normal safepoint is in place. HandshakeOneThread will then be a normal >> safepoint. The supported platforms are Linux x64 and Solaris SPARC. >> >> Tested heavily with various test suits and comes with a few new tests. >> >> Performance testing using standardized benchmark show no signification >> changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not >> statistically ensured). A minor regression for the load vs load load on x64 is >> expected and a slight increase on SPARC due to the cost of ?materializing? the >> page vs load load. >> The time to trigger a safepoint was measured on a large machine to not be an >> issue. The looping over threads and arming the polling page will benefit from >> the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: >> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) >> which puts all JavaThreads in an array instead of a linked list. >> >> Thanks, Robbin > From magnus.ihse.bursie at oracle.com Wed Oct 18 08:53:51 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Wed, 18 Oct 2017 10:53:51 +0200 Subject: RFR: JDK-8189608 Remove duplicated jni.h Message-ID: <05451b2c-9905-e1cc-7cfb-39fbe1d1c983@oracle.com> The file jni.h is stored twice in the repo, both in hotspot and in java.base. They are both identical, and only the java.base version is included in the final product. This bug is a part of the umbrella effort JDK-8167078 "Duplicate header files in hotspot and jdk". As for JDK-8189607, my reasoning is that the java.base version is the one to keep. (In this case, there was actually a small difference between the two files -- the hotspot version first copyright year was 1997, but the java.base version was 1996. It makes sense to keep the oldest one.) My assumption was that hotspot include files should be sorted according to the containing directory, and since jni.h no longer resides in "prims", I've rearranged the include line where needed. The -I path added in CompileJvm.gmk is identical to the one in JDK-8189607, and will be merged to the same change (depending on which fix enters first.) Bug: https://bugs.openjdk.java.net/browse/JDK-8189608 WebRev: http://cr.openjdk.java.net/~ihse/JDK-8189608-remove-duplicated-jni/webrev.01 /Magnus From serguei.spitsyn at oracle.com Wed Oct 18 09:00:09 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 18 Oct 2017 02:00:09 -0700 Subject: RFR: JDK-8189607 Remove duplicated jvmticmlr.h In-Reply-To: References: Message-ID: Hi Magnus, The fix looks good to me. Thank you for doing this cleanup. Thanks, Serguei On 10/18/17 01:04, Magnus Ihse Bursie wrote: > The file jvmticmlr.h is stored twice in the repo, both in hotspot and > in java.base. 
They are both identical, and only the java.base version > is included in the final product. This might arguably have been useful > in a pre-consolidated world, but makes absolutely no sense now. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8189607 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8189607-remove-duplicated-jvmticmlr/webrev.01 > > /Magnus From robbin.ehn at oracle.com Wed Oct 18 09:06:57 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 18 Oct 2017 11:06:57 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <15dd917732444959b7785efbe6640952@sap.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> Message-ID: Thanks for looking at this. On 2017-10-17 19:58, Doerr, Martin wrote: > Hi Robbin, > > my first impression is very good. Thanks for providing the webrev. Great! > > I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. > Would it be ok to move the decision between what to use to platform code? > (Some platforms could still use both if this is beneficial.) > > E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. I see no issue with this. Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. Can we do this incremental when adding the platform support for PPC64? Thanks, Robbin > > Best regards, > Martin > > > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn > Sent: Mittwoch, 11. Oktober 2017 15:38 > To: hotspot-dev developers > Subject: RFR(XL): 8185640: Thread-local handshakes > > Hi all, > > Starting the review of the code while JEP work is still not completed. > > JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 > > This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not > just all threads or none. > > Entire changeset: > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ > > Divided into 3-parts, > SafepointMechanism abstraction: > http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ > Consolidating polling page allocation: > http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ > Handshakes: > http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ > > A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread > itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be > performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a > handshake can be performed with that single JavaThread as well. > > The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the > guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. 
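(For illustration, a stripped-down sketch of such a per-thread poll indirection. Everything below - the DemoThread type, the page variables, the function names - is invented for this example; it is not the actual SafepointMechanism code:)

    struct DemoThread {
      volatile char* _polling_page;   // each thread polls through its own pointer
    };

    static volatile char* good_page;  // assumed mapped readable
    static volatile char* bad_page;   // assumed mapped inaccessible (the guard page)

    inline void arm(DemoThread* t)    { t->_polling_page = bad_page;  }
    inline void disarm(DemoThread* t) { t->_polling_page = good_page; }

    // The emitted poll is conceptually a single load through the pointer; an armed
    // thread faults on the guard page and the signal handler routes it to the
    // safepoint/handshake machinery.
    inline void poll(DemoThread* t) { (void)*t->_polling_page; }

A global safepoint would arm every thread's pointer, while a handshake with one JavaThread arms only that thread, which is what makes the per-thread stop cheap.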
> > Example of potential use-cases: > -Biased lock revocation > -External requests for stack traces > -Deoptimization > -Async exception delivery > -External suspension > -Eliding memory barriers > > All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. > Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported > platforms are Linux x64 and Solaris SPARC. > > Tested heavily with various test suits and comes with a few new tests. > > Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically > ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. > The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on > JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all > JavaThreads in an array instead of a linked list. > > Thanks, Robbin > From robbin.ehn at oracle.com Wed Oct 18 09:09:31 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 18 Oct 2017 11:09:31 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: <72a1da33-4680-1570-7d43-8ce28788f01c@oracle.com> Thanks Nils for looking at that! /Robbin On 2017-10-17 16:37, Nils Eliasson wrote: > Hi Robbin, > > I have reviewed the compiler parts of the patch - c1, c2, jvmci and cpu*. > > Look great! > > Regards, > > Nils > > > On 2017-10-11 15:37, Robbin Ehn wrote: >> Hi all, >> >> Starting the review of the code while JEP work is still not completed. >> >> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >> >> This JEP introduces a way to execute a callback on threads without performing >> a global VM safepoint. It makes it both possible and cheap to stop individual >> threads and not just all threads or none. >> >> Entire changeset: >> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >> >> Divided into 3-parts, >> SafepointMechanism abstraction: >> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >> Consolidating polling page allocation: >> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >> Handshakes: >> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >> >> A handshake operation is a callback that is executed for each JavaThread while >> that thread is in a safepoint safe state. The callback is executed either by >> the thread itself or by the VM thread while keeping the thread in a blocked >> state. The big difference between safepointing and handshaking is that the per >> thread operation will be performed on all threads as soon as possible and they >> will continue to execute as soon as it?s own operation is completed. If a >> JavaThread is known to be running, then a handshake can be performed with that >> single JavaThread as well. >> >> The current safepointing scheme is modified to perform an indirection through >> a per-thread pointer which will allow a single thread's execution to be forced >> to trap on the guard page. 
In order to force a thread to yield the VM updates >> the per-thread pointer for the corresponding thread to point to the guarded page. >> >> Example of potential use-cases: >> -Biased lock revocation >> -External requests for stack traces >> -Deoptimization >> -Async exception delivery >> -External suspension >> -Eliding memory barriers >> >> All of these will benefit the VM moving towards becoming more low-latency >> friendly by reducing the number of global safepoints. >> Platforms that do not yet implement the per JavaThread poll, a fallback to >> normal safepoint is in place. HandshakeOneThread will then be a normal >> safepoint. The supported platforms are Linux x64 and Solaris SPARC. >> >> Tested heavily with various test suits and comes with a few new tests. >> >> Performance testing using standardized benchmark show no signification >> changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not >> statistically ensured). A minor regression for the load vs load load on x64 is >> expected and a slight increase on SPARC due to the cost of ?materializing? the >> page vs load load. >> The time to trigger a safepoint was measured on a large machine to not be an >> issue. The looping over threads and arming the polling page will benefit from >> the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: >> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) >> which puts all JavaThreads in an array instead of a linked list. >> >> Thanks, Robbin > From erik.joelsson at oracle.com Wed Oct 18 09:15:31 2017 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Wed, 18 Oct 2017 11:15:31 +0200 Subject: RFR: JDK-8189608 Remove duplicated jni.h In-Reply-To: <05451b2c-9905-e1cc-7cfb-39fbe1d1c983@oracle.com> References: <05451b2c-9905-e1cc-7cfb-39fbe1d1c983@oracle.com> Message-ID: <3ce96f6a-e7fe-b7eb-2212-07bba5b5043f@oracle.com> Looks good to me. /Erik On 2017-10-18 10:53, Magnus Ihse Bursie wrote: > The file jni.h is stored twice in the repo, both in hotspot and in > java.base. They are both identical, and only the java.base version is > included in the final product. > > This bug is a part of the umbrella effort JDK-8167078 "Duplicate > header files in hotspot and jdk". As for JDK-8189607, my reasoning is > that the java.base version is the one to keep. (In this case, there > was actually a small difference between the two files -- the hotspot > version first copyright year was 1997, but the java.base version was > 1996. It makes sense to keep the oldest one.) > > My assumption was that hotspot include files should be sorted > according to the containing directory, and since jni.h no longer > resides in "prims", I've rearranged the include line where needed. > > The -I path added in CompileJvm.gmk is identical to the one in > JDK-8189607, and will be merged to the same change (depending on which > fix enters first.) 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8189608 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8189608-remove-duplicated-jni/webrev.01 > > /Magnus From robbin.ehn at oracle.com Wed Oct 18 09:15:53 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 18 Oct 2017 11:15:53 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: <82848a04-21dd-119e-3d53-101a7f25cb54@oracle.com> Hi all, Update after re-base with new atomic implementation: http://cr.openjdk.java.net/~rehn/8185640/v1/Atomic-Update-Rebase-3/ This goes on top of the Handshakes-2. Let me know if you want some other kinds of webrevs. I would like to point out that Mikael Gerdin and Erik ?sterlund also are contributors of this changeset. Thanks, Robbin On 2017-10-11 15:37, Robbin Ehn wrote: > Hi all, > > Starting the review of the code while JEP work is still not completed. > > JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 > > This JEP introduces a way to execute a callback on threads without performing a > global VM safepoint. It makes it both possible and cheap to stop individual > threads and not just all threads or none. > > Entire changeset: > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ > > Divided into 3-parts, > SafepointMechanism abstraction: > http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ > Consolidating polling page allocation: > http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ > Handshakes: > http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ > > A handshake operation is a callback that is executed for each JavaThread while > that thread is in a safepoint safe state. The callback is executed either by the > thread itself or by the VM thread while keeping the thread in a blocked state. > The big difference between safepointing and handshaking is that the per thread > operation will be performed on all threads as soon as possible and they will > continue to execute as soon as it?s own operation is completed. If a JavaThread > is known to be running, then a handshake can be performed with that single > JavaThread as well. > > The current safepointing scheme is modified to perform an indirection through a > per-thread pointer which will allow a single thread's execution to be forced to > trap on the guard page. In order to force a thread to yield the VM updates the > per-thread pointer for the corresponding thread to point to the guarded page. > > Example of potential use-cases: > -Biased lock revocation > -External requests for stack traces > -Deoptimization > -Async exception delivery > -External suspension > -Eliding memory barriers > > All of these will benefit the VM moving towards becoming more low-latency > friendly by reducing the number of global safepoints. > Platforms that do not yet implement the per JavaThread poll, a fallback to > normal safepoint is in place. HandshakeOneThread will then be a normal > safepoint. The supported platforms are Linux x64 and Solaris SPARC. > > Tested heavily with various test suits and comes with a few new tests. > > Performance testing using standardized benchmark show no signification changes, > the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not > statistically ensured). A minor regression for the load vs load load on x64 is > expected and a slight increase on SPARC due to the cost of ?materializing? the > page vs load load. 
> The time to trigger a safepoint was measured on a large machine to not be an > issue. The looping over threads and arming the polling page will benefit from > the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) > which puts all JavaThreads in an array instead of a linked list. > > Thanks, Robbin From david.holmes at oracle.com Wed Oct 18 09:29:50 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 19:29:50 +1000 Subject: RFR: JDK-8189607 Remove duplicated jvmticmlr.h In-Reply-To: References: Message-ID: <20e2aee8-17cc-e668-ae45-3d782794f9d3@oracle.com> Hi Magnus, This seems fine to me. Sanity check: the various -Ixxx will be processed in order and the first file found will be used - right? ie we won't unintentionally pick up the java.base jni.h. Thanks, David On 18/10/2017 6:04 PM, Magnus Ihse Bursie wrote: > The file jvmticmlr.h is stored twice in the repo, both in hotspot and in > java.base. They are both identical, and only the java.base version is > included in the final product. This might arguably have been useful in a > pre-consolidated world, but makes absolutely no sense now. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8189607 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8189607-remove-duplicated-jvmticmlr/webrev.01 > > > /Magnus From erik.joelsson at oracle.com Wed Oct 18 09:32:12 2017 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Wed, 18 Oct 2017 11:32:12 +0200 Subject: RFR: JDK-8189607 Remove duplicated jvmticmlr.h In-Reply-To: <20e2aee8-17cc-e668-ae45-3d782794f9d3@oracle.com> References: <20e2aee8-17cc-e668-ae45-3d782794f9d3@oracle.com> Message-ID: Hello David, On 2017-10-18 11:29, David Holmes wrote: > > Sanity check: the various -Ixxx will be processed in order and the > first file found will be used - right? ie we won't unintentionally > pick up the java.base jni.h. > Correct, the search order is the order in which the -I parameters are listed on the command line. /Erik From david.holmes at oracle.com Wed Oct 18 09:34:41 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 19:34:41 +1000 Subject: RFR: JDK-8189608 Remove duplicated jni.h In-Reply-To: <05451b2c-9905-e1cc-7cfb-39fbe1d1c983@oracle.com> References: <05451b2c-9905-e1cc-7cfb-39fbe1d1c983@oracle.com> Message-ID: <0a8d2474-38eb-8dc1-aa39-5c541b466222@oracle.com> Looks good to me. Thanks, David On 18/10/2017 6:53 PM, Magnus Ihse Bursie wrote: > The file jni.h is stored twice in the repo, both in hotspot and in > java.base. They are both identical, and only the java.base version is > included in the final product. > > This bug is a part of the umbrella effort JDK-8167078 "Duplicate header > files in hotspot and jdk". As for JDK-8189607, my reasoning is that the > java.base version is the one to keep. (In this case, there was actually > a small difference between the two files -- the hotspot version first > copyright year was 1997, but the java.base version was 1996. It makes > sense to keep the oldest one.) > > My assumption was that hotspot include files should be sorted according > to the containing directory, and since jni.h no longer resides in > "prims", I've rearranged the include line where needed. > > The -I path added in CompileJvm.gmk is identical to the one in > JDK-8189607, and will be merged to the same change (depending on which > fix enters first.) 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8189608 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8189608-remove-duplicated-jni/webrev.01 > > > /Magnus From martin.doerr at sap.com Wed Oct 18 10:11:14 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 18 Oct 2017 10:11:14 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> Message-ID: <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> Hi Robbin, so you would like to push your version first (as it does not break other platforms) and then help us to push non-Oracle platform implementations which change shared code again? I'd be fine with that, too. While thinking a little longer about the interpreter implementation, a new idea came into my mind. I think we could significantly reduce impact on interpreter code size and performance by using safepoint polls only in a subset of bytecodes. E.g., we could use only bytecodes which perform any kind of jump by implementing something like if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) generate_safepoint_poll(); in TemplateInterpreterGenerator::generate_and_dispatch. Best regards, Martin -----Original Message----- From: Robbin Ehn [mailto:robbin.ehn at oracle.com] Sent: Mittwoch, 18. Oktober 2017 11:07 To: Doerr, Martin ; hotspot-dev developers Subject: Re: RFR(XL): 8185640: Thread-local handshakes Thanks for looking at this. On 2017-10-17 19:58, Doerr, Martin wrote: > Hi Robbin, > > my first impression is very good. Thanks for providing the webrev. Great! > > I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. > Would it be ok to move the decision between what to use to platform code? > (Some platforms could still use both if this is beneficial.) > > E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. I see no issue with this. Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. Can we do this incremental when adding the platform support for PPC64? Thanks, Robbin > > Best regards, > Martin > > > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn > Sent: Mittwoch, 11. Oktober 2017 15:38 > To: hotspot-dev developers > Subject: RFR(XL): 8185640: Thread-local handshakes > > Hi all, > > Starting the review of the code while JEP work is still not completed. > > JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 > > This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not > just all threads or none. > > Entire changeset: > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ > > Divided into 3-parts, > SafepointMechanism abstraction: > http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ > Consolidating polling page allocation: > http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ > Handshakes: > http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ > > A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. 
The callback is executed either by the thread > itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be > performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a > handshake can be performed with that single JavaThread as well. > > The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the > guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. > > Example of potential use-cases: > -Biased lock revocation > -External requests for stack traces > -Deoptimization > -Async exception delivery > -External suspension > -Eliding memory barriers > > All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. > Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported > platforms are Linux x64 and Solaris SPARC. > > Tested heavily with various test suits and comes with a few new tests. > > Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically > ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. > The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on > JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all > JavaThreads in an array instead of a linked list. > > Thanks, Robbin > From martin.doerr at sap.com Wed Oct 18 10:43:37 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 18 Oct 2017 10:43:37 +0000 Subject: 8188131: [PPC] Increase inlining thresholds to the same as other platforms In-Reply-To: References: Message-ID: <6a5027ffe1c14f48a0bf39523c88aa4b@sap.com> Hi Ogata, sorry for the delay. I had missed this one. The change looks feasible to me. It may only impact the utilization of the Code Cache. Can you evaluate that (e.g. by running large benchmarks with -XX:+PrintCodeCache)? Thanks and best regards, Martin -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Kazunori Ogata Sent: Freitag, 29. September 2017 08:42 To: hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net Subject: RFR: 8188131: [PPC] Increase inlining thresholds to the same as other platforms Hi all, Please review a change for JDK-8188131. Bug report: https://bugs.openjdk.java.net/browse/JDK-8188131 Webrev: http://cr.openjdk.java.net/~horii/8188131/webrev.00/ This change increases the default values of FreqInlineSize and InlineSmallCode in ppc64 to 325 and 2500, respectively. These values are the same as aarch64. The performance of TPC-DS Q96 was improved by about 6% with this change. 
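(For context, platform defaults of this kind are expressed as platform-dependent globals in the CPU-specific globals headers; the two lines below are only a sketch using the values from the RFR, not the actual ppc64 change:)

    // Sketch only - the real definitions live in the ppc64 compiler globals headers.
    define_pd_global(intx, FreqInlineSize,   325);   // max bytecode size of a frequent method to inline
    define_pd_global(intx, InlineSmallCode, 2500);   // max compiled-code size of a callee still inlined

Running with -XX:+PrintCodeCache, as suggested above, would then show whether the larger inlining budget noticeably changes code cache occupancy.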
Regards, Ogata From thomas.schatzl at oracle.com Wed Oct 18 11:08:26 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 18 Oct 2017 13:08:26 +0200 Subject: Request for review JDK-8187819 gc/TestFullGCALot.java fails on jdk10 started with "-XX:-UseCompressedOops" option In-Reply-To: References: Message-ID: <1508324906.4435.15.camel@oracle.com> Hi, On Tue, 2017-10-03 at 14:44 -0400, Alexander Harlap wrote: > Please review the change for JDK-8187819 > > gc/TestFullGCALot.java fails on jdk10 started with > "-XX:-UseCompressedOops" option. > > Change is located at http://cr.openjdk.java.net/~aharlap/8187819/webrev.00/ > > Initialized metaspace performance counters before their potential > use. > > Tested - JPRT > - I think you should add the 8187819 number to the TestFullGCALot test at the @bug tag. Looks good otherwise. Thanks, Thomas From stefan.karlsson at oracle.com Wed Oct 18 11:55:34 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 18 Oct 2017 13:55:34 +0200 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: <8209F13B-72CA-4135-B589-09D72A0B54AA@oracle.com> References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> <8209F13B-72CA-4135-B589-09D72A0B54AA@oracle.com> Message-ID: Hi all, Updated webrevs: http://cr.openjdk.java.net/~stefank/8189359/webrev.03.delta http://cr.openjdk.java.net/~stefank/8189359/webrev.03 Changes in the webrevs: ------------------------------------------------------------------------ I've added back all the missing evacuate followers closures to try to mimic the original code as much as possible. This unveiled a bug for CMS with the original patch. I've re-added the following section to referenceProcessor.cpp: if (task_executor != NULL) { task_executor->set_single_threaded_mode(); } When running with CMS this executes the following code: void ParNewRefProcTaskExecutor::set_single_threaded_mode() { _state_set.flush(); GenCollectedHeap* gch = GenCollectedHeap::heap(); gch->save_marks(); } The missing call to GenCollectedHeap::save_marks() caused subsequent calls to the evacuate followers closure to assert that the same object was scanned twice. ------------------------------------------------------------------------ I also reverted to using PSKeepAliveClosure instead of PSScavengeRootsClosure in psScavenge.cpp. ------------------------------------------------------------------------ The comment I added for WeakProcessor::weak_oops_do previously stated that the function applied the "complete" closure after _each_ container had been processed. The next patch will move the call to JvmtiExport::weak_oops_do, and then the code wouldn't mimic the original code. I've updated the comment to state that we only apply the "complete" closure once, after _all_ containers have been processed. Thanks, StefanK On 2017-10-18 02:09, Kim Barrett wrote: >> On Oct 17, 2017, at 5:38 PM, Per Liden wrote: >> >> Hi, >> >> On 2017-10-17 22:57, Stefan Karlsson wrote: >> [...] >>> Here are the updated webrevs: >>> http://cr.openjdk.java.net/~stefank/8189359/webrev.01.delta >>> http://cr.openjdk.java.net/~stefank/8189359/webrev.01 >> >> Looks good. Just two comments.
>> >> share/gc/parallel/psScavenge.cpp: >> >> 446 { >> 447 GCTraceTime(Debug, gc, phases) tm("Weak Processing", &_gc_timer); >> 448 WeakProcessor::weak_oops_do(&_is_alive_closure, &root_closure); >> 449 } >> >> I see you've kept the "complete" closure in WeakProcessor::weak_oops_do(), which is fine and we can clean that out later, but here you don't seem to mimic exactly what the old code did. I think you want to pass in &evac_followers here, right? >> >> share/gc/serial/defNewGeneration.cpp: >> >> 662 WeakProcessor::weak_oops_do(&is_alive, &keep_alive); >> >> Same here, pass in &evacuate_followers? >> >> I don't need to see a new webrev. >> >> cheers, >> Per > > Oh, I missed that. Same thing in cms/parNewGeneration.cpp, I think. > > Otherwise, looks good. > > I don?t need a new webrev either. > > From per.liden at oracle.com Wed Oct 18 12:00:15 2017 From: per.liden at oracle.com (Per Liden) Date: Wed, 18 Oct 2017 14:00:15 +0200 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> <8209F13B-72CA-4135-B589-09D72A0B54AA@oracle.com> Message-ID: <9b09e340-8956-019b-fcbe-6affb844c708@oracle.com> Looks good! /Per On 2017-10-18 13:55, Stefan Karlsson wrote: > Hi all, > > Updated webrevs: > http://cr.openjdk.java.net/~stefank/8189359/webrev.03.delta > http://cr.openjdk.java.net/~stefank/8189359/webrev.03 > > Changes in the webrevs: > ------------------------------------------------------------------------ > I've added back all the missing evacuate followers closure to try to > mimic the original code as much as possible. > > This unveiled a bug for CMS with the original patch. I've re-added the > following section to referenceProcessor.cpp: > > if (task_executor != NULL) { > task_executor->set_single_threaded_mode(); > } > > When running with CMS this executes the following code: > > void ParNewRefProcTaskExecutor::set_single_threaded_mode() { > _state_set.flush(); > GenCollectedHeap* gch = GenCollectedHeap::heap(); > gch->save_marks(); > } > > The missing call to GenCollectedHeap::save_marks() caused subsequent > calls to the evacuate followers closure to assert that the same object > were scanned twice. > > ------------------------------------------------------------------------ > I also reverted to using PSKeepAliveClosure instead of > PSScavengeRootsClosure in psScavenge.cpp. > > ------------------------------------------------------------------------ > The comment I add for WeakProcessor::weak_oops_do previously stated that > the function applied the "complete" closure after _each_ container had > been processed. The next patch will move the call to > JvmtiExport::weak_oops_do, and then the code wouldn't mimic the original > code. > > I've updated the comment to state that we only apply the "complete" > closure once, after _all_ containers have been processed. > > Thanks, > StefanK > > > > On 2017-10-18 02:09, Kim Barrett wrote: >>> On Oct 17, 2017, at 5:38 PM, Per Liden wrote: >>> >>> Hi, >>> >>> On 2017-10-17 22:57, Stefan Karlsson wrote: >>> [...] >>>> Here are the updated webrevs: >>>> http://cr.openjdk.java.net/~stefank/8189359/webrev.01.delta >>>> http://cr.openjdk.java.net/~stefank/8189359/webrev.01 >>> >>> Looks good. Just two comments. 
>>> >>> share/gc/parallel/psScavenge.cpp: >>> >>> 446 { >>> 447 GCTraceTime(Debug, gc, phases) tm("Weak Processing", >>> &_gc_timer); >>> 448 WeakProcessor::weak_oops_do(&_is_alive_closure, >>> &root_closure); >>> 449 } >>> >>> I see you've kept the "complete" closure in >>> WeakProcessor::weak_oops_do(), which is fine and we can clean that >>> out later, but here you don't seem to mimic exactly what the old code >>> did. I think you want to pass in &evac_followers here, right? >>> >>> share/gc/serial/defNewGeneration.cpp: >>> >>> 662 WeakProcessor::weak_oops_do(&is_alive, &keep_alive); >>> >>> Same here, pass in &evacuate_followers? >>> >>> I don't need to see a new webrev. >>> >>> cheers, >>> Per >> >> Oh, I missed that. Same thing in cms/parNewGeneration.cpp, I think. >> >> Otherwise, looks good. >> >> I don?t need a new webrev either. >> >> From stefan.karlsson at oracle.com Wed Oct 18 12:01:40 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 18 Oct 2017 14:01:40 +0200 Subject: 8189360: JvmtiExport::weak_oops_do is called for all JNIHandleBlock instances In-Reply-To: <67b8baf1-0e2b-7ebc-2826-de81da5cf770@oracle.com> References: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> <67b8baf1-0e2b-7ebc-2826-de81da5cf770@oracle.com> Message-ID: <276b4eb8-1bb1-0ec7-72c6-6279665b58f5@oracle.com> Hi Per, On 2017-10-17 23:43, Per Liden wrote: > Hi, > > On 2017-10-16 17:40, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to move the call of the static >> JvmtiExport::weak_oops_do out of the JNIHandleBlock::weak_oops_do >> member function into the new WeakProcessor. >> >> Today, this isn't causing any bugs because there's only one instance >> of JNIHandleBlock, the _weak_global_handles. However, in prototypes >> with more than one JNIHandleBlock, this results in multiple calls to >> JvmtiExport::weak_oops_do. >> >> http://cr.openjdk.java.net/~stefank/8189360/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8189360 > > ? 30 void WeakProcessor::unlink_or_oops_do(BoolObjectClosure* is_alive, > OopClosure* keep_alive, VoidClosure* complete) { > ? 31?? JNIHandles::weak_oops_do(is_alive, keep_alive); > ? 32?? if (complete != NULL) { > ? 33???? complete->do_void(); > ? 34?? } > ? 35 > ? 36?? JvmtiExport::weak_oops_do(is_alive, keep_alive); > ? 37?? if (complete != NULL) { > ? 38???? complete->do_void(); > ? 39?? } > ? 40 } > > Should you really be calling complete->do_void() twice here. It seems to > me that doing it once, after both calls to weak_oops_do() would mimic > what the old code did? You're right. I've update the code to only call the "complete" closure at the end of the function. FYI, the latest revision of the patch for 8189360 also updated the name unlink_or_oops_do to weak_oops_do. I've also taken the liberty to implement oops_do as a call to weak_oops_do. This way we only have to list the calls to the individual containers once. New webrevs: http://cr.openjdk.java.net/~stefank/8189360/webrev.01.delta http://cr.openjdk.java.net/~stefank/8189360/webrev.01 Thanks, StefanK > > cheers, > Per > >> >> This patch builds upon the patch in: >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-October/028684.html >> >> >> Tested with JPRT. 
>> >> Thanks, >> StefanK From stefan.karlsson at oracle.com Wed Oct 18 12:02:05 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 18 Oct 2017 14:02:05 +0200 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: <9b09e340-8956-019b-fcbe-6affb844c708@oracle.com> References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> <8209F13B-72CA-4135-B589-09D72A0B54AA@oracle.com> <9b09e340-8956-019b-fcbe-6affb844c708@oracle.com> Message-ID: <0d222a7b-0a00-737d-2100-ad96b58a1ee3@oracle.com> Thanks, Per! StefanK On 2017-10-18 14:00, Per Liden wrote: > Looks good! > > /Per > > On 2017-10-18 13:55, Stefan Karlsson wrote: >> Hi all, >> >> Updated webrevs: >> ?http://cr.openjdk.java.net/~stefank/8189359/webrev.03.delta >> ?http://cr.openjdk.java.net/~stefank/8189359/webrev.03 >> >> Changes in the webrevs: >> ------------------------------------------------------------------------ >> I've added back all the missing evacuate followers closure to try to >> mimic the original code as much as possible. >> >> This unveiled a bug for CMS with the original patch. I've re-added the >> following section to referenceProcessor.cpp: >> >> ? if (task_executor != NULL) { >> ??? task_executor->set_single_threaded_mode(); >> ? } >> >> When running with CMS this executes the following code: >> >> ? void ParNewRefProcTaskExecutor::set_single_threaded_mode() { >> ??? _state_set.flush(); >> ??? GenCollectedHeap* gch = GenCollectedHeap::heap(); >> ??? gch->save_marks(); >> ? } >> >> The missing call to GenCollectedHeap::save_marks() caused subsequent >> calls to the evacuate followers closure to assert that the same object >> were scanned twice. >> >> ------------------------------------------------------------------------ >> I also reverted to using PSKeepAliveClosure instead of >> PSScavengeRootsClosure in psScavenge.cpp. >> >> ------------------------------------------------------------------------ >> The comment I add for WeakProcessor::weak_oops_do previously stated that >> the function applied the "complete" closure after _each_ container had >> been processed. The next patch will move the call to >> JvmtiExport::weak_oops_do, and then the code wouldn't mimic the original >> code. >> >> I've updated the comment to state that we only apply the "complete" >> closure once, after _all_ containers have been processed. >> >> Thanks, >> StefanK >> >> >> >> On 2017-10-18 02:09, Kim Barrett wrote: >>>> On Oct 17, 2017, at 5:38 PM, Per Liden wrote: >>>> >>>> Hi, >>>> >>>> On 2017-10-17 22:57, Stefan Karlsson wrote: >>>> [...] >>>>> Here are the updated webrevs: >>>>> ? http://cr.openjdk.java.net/~stefank/8189359/webrev.01.delta >>>>> ? http://cr.openjdk.java.net/~stefank/8189359/webrev.01 >>>> >>>> Looks good. Just two comments. >>>> >>>> share/gc/parallel/psScavenge.cpp: >>>> >>>> 446???? { >>>> 447?????? GCTraceTime(Debug, gc, phases) tm("Weak Processing", >>>> &_gc_timer); >>>> 448?????? WeakProcessor::weak_oops_do(&_is_alive_closure, >>>> &root_closure); >>>> 449???? } >>>> >>>> I see you've kept the "complete" closure in >>>> WeakProcessor::weak_oops_do(), which is fine and we can clean that >>>> out later, but here you don't seem to mimic exactly what the old code >>>> did. I think you want to pass in &evac_followers here, right? >>>> >>>> share/gc/serial/defNewGeneration.cpp: >>>> >>>> 662?? WeakProcessor::weak_oops_do(&is_alive, &keep_alive); >>>> >>>> Same here, pass in &evacuate_followers? >>>> >>>> I don't need to see a new webrev. 
>>>> >>>> cheers, >>>> Per >>> >>> Oh, I missed that.? Same thing in cms/parNewGeneration.cpp, I think. >>> >>> Otherwise, looks good. >>> >>> I don?t need a new webrev either. >>> >>> From per.liden at oracle.com Wed Oct 18 12:18:42 2017 From: per.liden at oracle.com (Per Liden) Date: Wed, 18 Oct 2017 14:18:42 +0200 Subject: 8189360: JvmtiExport::weak_oops_do is called for all JNIHandleBlock instances In-Reply-To: <276b4eb8-1bb1-0ec7-72c6-6279665b58f5@oracle.com> References: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> <67b8baf1-0e2b-7ebc-2826-de81da5cf770@oracle.com> <276b4eb8-1bb1-0ec7-72c6-6279665b58f5@oracle.com> Message-ID: Looks good! /Per On 2017-10-18 14:01, Stefan Karlsson wrote: > Hi Per, > > On 2017-10-17 23:43, Per Liden wrote: >> Hi, >> >> On 2017-10-16 17:40, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please review this patch to move the call of the static >>> JvmtiExport::weak_oops_do out of the JNIHandleBlock::weak_oops_do >>> member function into the new WeakProcessor. >>> >>> Today, this isn't causing any bugs because there's only one instance >>> of JNIHandleBlock, the _weak_global_handles. However, in prototypes >>> with more than one JNIHandleBlock, this results in multiple calls to >>> JvmtiExport::weak_oops_do. >>> >>> http://cr.openjdk.java.net/~stefank/8189360/webrev.00/ >>> https://bugs.openjdk.java.net/browse/JDK-8189360 >> >> 30 void WeakProcessor::unlink_or_oops_do(BoolObjectClosure* >> is_alive, OopClosure* keep_alive, VoidClosure* complete) { >> 31 JNIHandles::weak_oops_do(is_alive, keep_alive); >> 32 if (complete != NULL) { >> 33 complete->do_void(); >> 34 } >> 35 >> 36 JvmtiExport::weak_oops_do(is_alive, keep_alive); >> 37 if (complete != NULL) { >> 38 complete->do_void(); >> 39 } >> 40 } >> >> Should you really be calling complete->do_void() twice here. It seems >> to me that doing it once, after both calls to weak_oops_do() would >> mimic what the old code did? > > You're right. I've update the code to only call the "complete" closure > at the end of the function. > > FYI, the latest revision of the patch for 8189360 also updated the name > unlink_or_oops_do to weak_oops_do. > > I've also taken the liberty to implement oops_do as a call to > weak_oops_do. This way we only have to list the calls to the individual > containers once. > > New webrevs: > http://cr.openjdk.java.net/~stefank/8189360/webrev.01.delta > http://cr.openjdk.java.net/~stefank/8189360/webrev.01 > > Thanks, > StefanK > >> >> cheers, >> Per >> >>> >>> This patch builds upon the patch in: >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-October/028684.html >>> >>> >>> Tested with JPRT. >>> >>> Thanks, >>> StefanK From coleen.phillimore at oracle.com Wed Oct 18 13:14:42 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 18 Oct 2017 09:14:42 -0400 Subject: RFR: JDK-8189608 Remove duplicated jni.h In-Reply-To: <05451b2c-9905-e1cc-7cfb-39fbe1d1c983@oracle.com> References: <05451b2c-9905-e1cc-7cfb-39fbe1d1c983@oracle.com> Message-ID: <237f7a02-73a8-a121-d4f4-5978c7479b79@oracle.com> This looks great.? There's also jvm.h too, which is a little more different but shouldn't be. Did/could you make this change in the jdk10/hs repository since it's primarily hotspot files??? I can't tell from the webrev. Thanks, Coleen On 10/18/17 4:53 AM, Magnus Ihse Bursie wrote: > The file jni.h is stored twice in the repo, both in hotspot and in > java.base. They are both identical, and only the java.base version is > included in the final product. 
> > This bug is a part of the umbrella effort JDK-8167078 "Duplicate > header files in hotspot and jdk". As for JDK-8189607, my reasoning is > that the java.base version is the one to keep. (In this case, there > was actually a small difference between the two files -- the hotspot > version first copyright year was 1997, but the java.base version was > 1996. It makes sense to keep the oldest one.) > > My assumption was that hotspot include files should be sorted > according to the containing directory, and since jni.h no longer > resides in "prims", I've rearranged the include line where needed. > > The -I path added in CompileJvm.gmk is identical to the one in > JDK-8189607, and will be merged to the same change (depending on which > fix enters first.) > > Bug: https://bugs.openjdk.java.net/browse/JDK-8189608 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8189608-remove-duplicated-jni/webrev.01 > > /Magnus From stefan.karlsson at oracle.com Wed Oct 18 13:16:51 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 18 Oct 2017 15:16:51 +0200 Subject: 8189360: JvmtiExport::weak_oops_do is called for all JNIHandleBlock instances In-Reply-To: References: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> <67b8baf1-0e2b-7ebc-2826-de81da5cf770@oracle.com> <276b4eb8-1bb1-0ec7-72c6-6279665b58f5@oracle.com> Message-ID: <6b8d32fe-592b-6087-3283-d46546aba044@oracle.com> Thanks, Per! StefanK On 2017-10-18 14:18, Per Liden wrote: > Looks good! > > /Per > > On 2017-10-18 14:01, Stefan Karlsson wrote: >> Hi Per, >> >> On 2017-10-17 23:43, Per Liden wrote: >>> Hi, >>> >>> On 2017-10-16 17:40, Stefan Karlsson wrote: >>>> Hi all, >>>> >>>> Please review this patch to move the call of the static >>>> JvmtiExport::weak_oops_do out of the JNIHandleBlock::weak_oops_do >>>> member function into the new WeakProcessor. >>>> >>>> Today, this isn't causing any bugs because there's only one instance >>>> of JNIHandleBlock, the _weak_global_handles. However, in prototypes >>>> with more than one JNIHandleBlock, this results in multiple calls to >>>> JvmtiExport::weak_oops_do. >>>> >>>> http://cr.openjdk.java.net/~stefank/8189360/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8189360 >>> >>> ?? 30 void WeakProcessor::unlink_or_oops_do(BoolObjectClosure* >>> is_alive, OopClosure* keep_alive, VoidClosure* complete) { >>> ?? 31?? JNIHandles::weak_oops_do(is_alive, keep_alive); >>> ?? 32?? if (complete != NULL) { >>> ?? 33???? complete->do_void(); >>> ?? 34?? } >>> ?? 35 >>> ?? 36?? JvmtiExport::weak_oops_do(is_alive, keep_alive); >>> ?? 37?? if (complete != NULL) { >>> ?? 38???? complete->do_void(); >>> ?? 39?? } >>> ?? 40 } >>> >>> Should you really be calling complete->do_void() twice here. It seems >>> to me that doing it once, after both calls to weak_oops_do() would >>> mimic what the old code did? >> >> You're right. I've update the code to only call the "complete" closure >> at the end of the function. >> >> FYI, the latest revision of the patch for 8189360 also updated the name >> unlink_or_oops_do to weak_oops_do. >> >> I've also taken the liberty to implement oops_do as a call to >> weak_oops_do. This way we only have to list the calls to the individual >> containers once. 
>> >> New webrevs: >> ?http://cr.openjdk.java.net/~stefank/8189360/webrev.01.delta >> ?http://cr.openjdk.java.net/~stefank/8189360/webrev.01 >> >> Thanks, >> StefanK >> >>> >>> cheers, >>> Per >>> >>>> >>>> This patch builds upon the patch in: >>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-October/028684.html >>>> >>>> >>>> >>>> Tested with JPRT. >>>> >>>> Thanks, >>>> StefanK From robbin.ehn at oracle.com Wed Oct 18 13:57:35 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 18 Oct 2017 15:57:35 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> Message-ID: <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> Hi Martin, On 2017-10-18 12:11, Doerr, Martin wrote: > Hi Robbin, > > so you would like to push your version first (as it does not break other platforms) and then help us to push non-Oracle platform implementations which change shared code again? > I'd be fine with that, too. Yes, great! > > While thinking a little longer about the interpreter implementation, a new idea came into my mind. > I think we could significantly reduce impact on interpreter code size and performance by using safepoint polls only in a subset of bytecodes. E.g., we could use only bytecodes which perform any kind of jump by implementing something like > if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) generate_safepoint_poll(); > in TemplateInterpreterGenerator::generate_and_dispatch. We have not seen any performance regression in simple benchmark with this. I will do a better benchmark and compare what difference it makes. Thanks, Robbin > > Best regards, > Martin > > > -----Original Message----- > From: Robbin Ehn [mailto:robbin.ehn at oracle.com] > Sent: Mittwoch, 18. Oktober 2017 11:07 > To: Doerr, Martin ; hotspot-dev developers > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > Thanks for looking at this. > > On 2017-10-17 19:58, Doerr, Martin wrote: >> Hi Robbin, >> >> my first impression is very good. Thanks for providing the webrev. > > Great! > >> >> I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. >> Would it be ok to move the decision between what to use to platform code? >> (Some platforms could still use both if this is beneficial.) >> >> E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. > > I see no issue with this. > Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. > Can we do this incremental when adding the platform support for PPC64? > > Thanks, Robbin > >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn >> Sent: Mittwoch, 11. Oktober 2017 15:38 >> To: hotspot-dev developers >> Subject: RFR(XL): 8185640: Thread-local handshakes >> >> Hi all, >> >> Starting the review of the code while JEP work is still not completed. >> >> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >> >> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. 
It makes it both possible and cheap to stop individual threads and not >> just all threads or none. >> >> Entire changeset: >> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >> >> Divided into 3-parts, >> SafepointMechanism abstraction: >> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >> Consolidating polling page allocation: >> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >> Handshakes: >> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >> >> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread >> itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be >> performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a >> handshake can be performed with that single JavaThread as well. >> >> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the >> guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >> >> Example of potential use-cases: >> -Biased lock revocation >> -External requests for stack traces >> -Deoptimization >> -Async exception delivery >> -External suspension >> -Eliding memory barriers >> >> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported >> platforms are Linux x64 and Solaris SPARC. >> >> Tested heavily with various test suits and comes with a few new tests. >> >> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically >> ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on >> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all >> JavaThreads in an array instead of a linked list. >> >> Thanks, Robbin >> From coleen.phillimore at oracle.com Wed Oct 18 14:00:26 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 18 Oct 2017 10:00:26 -0400 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> Message-ID: <5b81e9e8-eb09-4598-6da2-212ad37cb1c1@oracle.com> On 10/18/17 9:57 AM, Robbin Ehn wrote: >> >> While thinking a little longer about the interpreter implementation, >> a new idea came into my mind. 
>> I think we could significantly reduce impact on interpreter code size >> and performance by using safepoint polls only in a subset of >> bytecodes. E.g., we could use only bytecodes which perform any kind >> of jump by implementing something like >> if (SafepointMechanism::uses_thread_local_poll() && >> t->does_dispatch()) generate_safepoint_poll(); >> in TemplateInterpreterGenerator::generate_and_dispatch. > > We have not seen any performance regression in simple benchmark with > this. > I will do a better benchmark and compare what difference it makes. I think this is a good suggestion for a further RFE.? At one point, I'd only enabled safepoints for backward branches and returns in the safepoint table but it had no effect on performance, but since this generates code in dispatch_epilogue, it might help with code bloat. Thanks, Coleen From martin.doerr at sap.com Wed Oct 18 14:05:49 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 18 Oct 2017 14:05:49 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> Message-ID: Hi Robbin, thanks for the quick reply and for doing additional benchmarks. Please note that t->does_dispatch() was just a first idea, but doesn't really fit for the purpose because it's false for conditional branch bytecodes for example. I just didn't find an appropriate quick check in the existing code. I guess you will notice a performance impact when benchmarking with -Xint. (I don't know if Oracle usually runs startup performance benchmarks.) Best regards, Martin -----Original Message----- From: Robbin Ehn [mailto:robbin.ehn at oracle.com] Sent: Mittwoch, 18. Oktober 2017 15:58 To: Doerr, Martin ; hotspot-dev developers Subject: Re: RFR(XL): 8185640: Thread-local handshakes Hi Martin, On 2017-10-18 12:11, Doerr, Martin wrote: > Hi Robbin, > > so you would like to push your version first (as it does not break other platforms) and then help us to push non-Oracle platform implementations which change shared code again? > I'd be fine with that, too. Yes, great! > > While thinking a little longer about the interpreter implementation, a new idea came into my mind. > I think we could significantly reduce impact on interpreter code size and performance by using safepoint polls only in a subset of bytecodes. E.g., we could use only bytecodes which perform any kind of jump by implementing something like > if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) generate_safepoint_poll(); > in TemplateInterpreterGenerator::generate_and_dispatch. We have not seen any performance regression in simple benchmark with this. I will do a better benchmark and compare what difference it makes. Thanks, Robbin > > Best regards, > Martin > > > -----Original Message----- > From: Robbin Ehn [mailto:robbin.ehn at oracle.com] > Sent: Mittwoch, 18. Oktober 2017 11:07 > To: Doerr, Martin ; hotspot-dev developers > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > Thanks for looking at this. > > On 2017-10-17 19:58, Doerr, Martin wrote: >> Hi Robbin, >> >> my first impression is very good. Thanks for providing the webrev. > > Great! > >> >> I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. 
>> Would it be ok to move the decision between what to use to platform code? >> (Some platforms could still use both if this is beneficial.) >> >> E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. > > I see no issue with this. > Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. > Can we do this incremental when adding the platform support for PPC64? > > Thanks, Robbin > >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn >> Sent: Mittwoch, 11. Oktober 2017 15:38 >> To: hotspot-dev developers >> Subject: RFR(XL): 8185640: Thread-local handshakes >> >> Hi all, >> >> Starting the review of the code while JEP work is still not completed. >> >> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >> >> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not >> just all threads or none. >> >> Entire changeset: >> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >> >> Divided into 3-parts, >> SafepointMechanism abstraction: >> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >> Consolidating polling page allocation: >> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >> Handshakes: >> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >> >> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread >> itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be >> performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a >> handshake can be performed with that single JavaThread as well. >> >> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the >> guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >> >> Example of potential use-cases: >> -Biased lock revocation >> -External requests for stack traces >> -Deoptimization >> -Async exception delivery >> -External suspension >> -Eliding memory barriers >> >> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported >> platforms are Linux x64 and Solaris SPARC. >> >> Tested heavily with various test suits and comes with a few new tests. >> >> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically >> ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? 
the page vs load load. >> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on >> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all >> JavaThreads in an array instead of a linked list. >> >> Thanks, Robbin >> From claes.redestad at oracle.com Wed Oct 18 14:28:34 2017 From: claes.redestad at oracle.com (Claes Redestad) Date: Wed, 18 Oct 2017 16:28:34 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> Message-ID: <394ca03f-ce6b-5200-8bde-6c4bcb40d35f@oracle.com> Hi! On 2017-10-18 16:05, Doerr, Martin wrote: > [...] when benchmarking with -Xint. (I don't know if Oracle usually runs startup performance benchmarks.) we do a lot of benchmarking to measure startup, warmup and footprint on a variety of applications, and have been improving tooling to flag even very small regressions (statistically significant results on <0.5M instruction increases). -Xint is typically not explicitly used for any benchmarking other than as a diagnostic tool, and even if we did I'd imagine we'd not file bugs if they didn't also correlate with a regression in a mixed mode config. /Claes From martin.doerr at sap.com Wed Oct 18 14:43:13 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 18 Oct 2017 14:43:13 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <394ca03f-ce6b-5200-8bde-6c4bcb40d35f@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <394ca03f-ce6b-5200-8bde-6c4bcb40d35f@oracle.com> Message-ID: Hi Claes, thanks for the explanation. We use -Xint benchmarking only when we make significant interpreter changes as quick regression check (not so relevant for real life, but delivers stable and quick results). Best regards, Martin -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Claes Redestad Sent: Mittwoch, 18. Oktober 2017 16:29 To: hotspot-dev at openjdk.java.net Subject: Re: RFR(XL): 8185640: Thread-local handshakes Hi! On 2017-10-18 16:05, Doerr, Martin wrote: > [...] when benchmarking with -Xint. (I don't know if Oracle usually runs startup performance benchmarks.) we do a lot of benchmarking to measure startup, warmup and footprint on a variety of applications, and have been improving tooling to flag even very small regressions (statistically significant results on <0.5M instruction increases). -Xint is typically not explicitly used for any benchmarking other than as a diagnostic tool, and even if we did I'd imagine we'd not file bugs if they didn't also correlate with a regression in a mixed mode config. /Claes From jesper.wilhelmsson at oracle.com Wed Oct 18 17:21:01 2017 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Wed, 18 Oct 2017 19:21:01 +0200 Subject: Integration blockers Message-ID: Hi, I've gone through all bugs filed based on nightly findings since the last integration to master and added the integration_blocker label to most of them. 
I tried to filter out infrastructure problems and bugs that seems to originate from master. The result is 14 blockers. There can obviously be several cases where the label can be removed but the rule is the same as it has been before: All bugs found by nightly testing should have the integration_blocker label, or a motivation as to why it is not a blocker. One (desperate) way to remove the integration_blocker label from a pure test bug is to add the test to the problem list. This is not recommended, but possible. The integration_blocker label is then moved to the subtask used to problem list the test. If a test is put on the problem list due to a VM issue (not recommended) the bug remains an integration blocker. Thanks, /Jesper From mandy.chung at oracle.com Wed Oct 18 17:57:26 2017 From: mandy.chung at oracle.com (mandy chung) Date: Wed, 18 Oct 2017 10:57:26 -0700 Subject: RFR: JDK-8189607 Remove duplicated jvmticmlr.h In-Reply-To: References: Message-ID: <1012e7eb-0509-ceb7-789d-658e87ad1a14@oracle.com> On 10/18/17 1:26 AM, Erik Joelsson wrote: > On 2017-10-18 10:04, Magnus Ihse Bursie wrote: >> The file jvmticmlr.h is stored twice in the repo, both in hotspot and >> in java.base. They are both identical, and only the java.base version >> is included in the final product. This might arguably have been >> useful in a pre-consolidated world, but makes absolutely no sense now. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8189607 >> WebRev: >> http://cr.openjdk.java.net/~ihse/JDK-8189607-remove-duplicated-jvmticmlr/webrev.01 >> > The question is, which file location makes the most sense. I think > your pick of java.base/share/native/include probably makes more sense > as that makes it much clearer that this is an exported header file. > jvmticmlr.h is an exported header file and java.base/share/native/include is a proper location as described in JEP 201 about the modular source layout. The change looks good to me too. Mandy From kim.barrett at oracle.com Wed Oct 18 18:21:50 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 18 Oct 2017 14:21:50 -0400 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> <8209F13B-72CA-4135-B589-09D72A0B54AA@oracle.com> Message-ID: > On Oct 18, 2017, at 7:55 AM, Stefan Karlsson wrote: > > Hi all, > > Updated webrevs: > http://cr.openjdk.java.net/~stefank/8189359/webrev.03.delta > http://cr.openjdk.java.net/~stefank/8189359/webrev.03 Looks good. From kim.barrett at oracle.com Wed Oct 18 18:24:21 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 18 Oct 2017 14:24:21 -0400 Subject: 8189360: JvmtiExport::weak_oops_do is called for all JNIHandleBlock instances In-Reply-To: <276b4eb8-1bb1-0ec7-72c6-6279665b58f5@oracle.com> References: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> <67b8baf1-0e2b-7ebc-2826-de81da5cf770@oracle.com> <276b4eb8-1bb1-0ec7-72c6-6279665b58f5@oracle.com> Message-ID: > On Oct 18, 2017, at 8:01 AM, Stefan Karlsson wrote: > New webrevs: > http://cr.openjdk.java.net/~stefank/8189360/webrev.01.delta > http://cr.openjdk.java.net/~stefank/8189360/webrev.01 Looks good. 
From stefan.karlsson at oracle.com Wed Oct 18 19:11:17 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 18 Oct 2017 21:11:17 +0200 Subject: 8189360: JvmtiExport::weak_oops_do is called for all JNIHandleBlock instances In-Reply-To: References: <8e8b2dd7-3e49-ef54-6e3b-f13fb847cbd8@oracle.com> <67b8baf1-0e2b-7ebc-2826-de81da5cf770@oracle.com> <276b4eb8-1bb1-0ec7-72c6-6279665b58f5@oracle.com> Message-ID: Thanks all for reviewing. StefanK On 2017-10-18 20:24, Kim Barrett wrote: >> On Oct 18, 2017, at 8:01 AM, Stefan Karlsson wrote: >> New webrevs: >> http://cr.openjdk.java.net/~stefank/8189360/webrev.01.delta >> http://cr.openjdk.java.net/~stefank/8189360/webrev.01 > Looks good. > From stefan.karlsson at oracle.com Wed Oct 18 19:11:48 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 18 Oct 2017 21:11:48 +0200 Subject: RFR: 8189359: Move native weak oops cleaning out of ReferenceProcessor In-Reply-To: References: <8c0aafa1-ca06-105c-72f9-7bd11d382452@oracle.com> <8209F13B-72CA-4135-B589-09D72A0B54AA@oracle.com> Message-ID: Thanks all for reviewing. StefanK On 2017-10-18 20:21, Kim Barrett wrote: >> On Oct 18, 2017, at 7:55 AM, Stefan Karlsson wrote: >> >> Hi all, >> >> Updated webrevs: >> http://cr.openjdk.java.net/~stefank/8189359/webrev.03.delta >> http://cr.openjdk.java.net/~stefank/8189359/webrev.03 > Looks good. > From coleen.phillimore at oracle.com Wed Oct 18 20:44:28 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 18 Oct 2017 16:44:28 -0400 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: This looks really nice.? A few minor comments. http://cr.openjdk.java.net/~rehn/8185640/v0/flat/webrev/src/hotspot/share/runtime/handshake.hpp.html 51 // or the JavaThread it self. typo, "itself" Thank you for adding these comments.? I think they're just right in length and detail in the header. http://cr.openjdk.java.net/~rehn/8185640/v0/flat/webrev/src/hotspot/share/runtime/handshake.cpp.html The protocol in HandshakeState::process_self_inner and cancel_inner is: ??? clear_handshake(thread); ??? if (op != NULL) { ????? op->do_handshake(thread); ??? } But in HandshakeState::process_by_vmthread(), the order is reversed.? Can you explain why in the comments. ??? _operation->do_handshake(target); ??? clear_handshake(target); It looks like the thread can't continue while the handshake operation is in progress, so does the order matter? http://cr.openjdk.java.net/~rehn/8185640/v0/flat/webrev/test/hotspot/jtreg/runtime/handshake/HandshakeWalkStackNativeTest.java.html This has the wrong @test name.? These could use an @comment line about what you expect also.? I don't know what's "Native" about it though, isn't it testing what happens when you use -XX:+ThreadLocalHandshakes? http://cr.openjdk.java.net/~rehn/8185640/v0/flat/webrev/test/hotspot/jtreg/runtime/handshake/HandshakeWalkStackFallbackTest.java.html This one too an @comment that it's testing the fallback VM operation would be good. I don't need to see another webrev for the comment changes. Lastly, as I said before, I think putting the safepoint polls in the interpreter at return and backward branches would be a good follow on changeset. Thanks, Coleen On 10/11/17 9:37 AM, Robbin Ehn wrote: > Hi all, > > Starting the review of the code while JEP work is still not completed. 
> > JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 > > This JEP introduces a way to execute a callback on threads without > performing a global VM safepoint. It makes it both possible and cheap > to stop individual threads and not just all threads or none. > > Entire changeset: > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ > > Divided into 3-parts, > SafepointMechanism abstraction: > http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ > Consolidating polling page allocation: > http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ > Handshakes: > http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ > > A handshake operation is a callback that is executed for each > JavaThread while that thread is in a safepoint safe state. The > callback is executed either by the thread itself or by the VM thread > while keeping the thread in a blocked state. The big difference > between safepointing and handshaking is that the per thread operation > will be performed on all threads as soon as possible and they will > continue to execute as soon as it?s own operation is completed. If a > JavaThread is known to be running, then a handshake can be performed > with that single JavaThread as well. > > The current safepointing scheme is modified to perform an indirection > through a per-thread pointer which will allow a single thread's > execution to be forced to trap on the guard page. In order to force a > thread to yield the VM updates the per-thread pointer for the > corresponding thread to point to the guarded page. > > Example of potential use-cases: > -Biased lock revocation > -External requests for stack traces > -Deoptimization > -Async exception delivery > -External suspension > -Eliding memory barriers > > All of these will benefit the VM moving towards becoming more > low-latency friendly by reducing the number of global safepoints. > Platforms that do not yet implement the per JavaThread poll, a > fallback to normal safepoint is in place. HandshakeOneThread will then > be a normal safepoint. The supported platforms are Linux x64 and > Solaris SPARC. > > Tested heavily with various test suits and comes with a few new tests. > > Performance testing using standardized benchmark show no signification > changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris > SPARC (not statistically ensured). A minor regression for the load vs > load load on x64 is expected and a slight increase on SPARC due to the > cost of ?materializing? the page vs load load. > The time to trigger a safepoint was measured on a large machine to not > be an issue. The looping over threads and arming the polling page will > benefit from the work on JavaThread life-cycle (8167108 - SMR and > JavaThread Lifecycle: > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) > which puts all JavaThreads in an array instead of a linked list. > > Thanks, Robbin From david.holmes at oracle.com Thu Oct 19 02:07:48 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 19 Oct 2017 12:07:48 +1000 Subject: RFR: JDK-8189608 Remove duplicated jni.h In-Reply-To: <237f7a02-73a8-a121-d4f4-5978c7479b79@oracle.com> References: <05451b2c-9905-e1cc-7cfb-39fbe1d1c983@oracle.com> <237f7a02-73a8-a121-d4f4-5978c7479b79@oracle.com> Message-ID: On 18/10/2017 11:14 PM, coleen.phillimore at oracle.com wrote: > > This looks great.? There's also jvm.h too, which is a little more > different but shouldn't be. jvm.h plus the platform specific headers need a bit more work. 
There's a runtime bug open for that: https://bugs.openjdk.java.net/browse/JDK-8189610 > Did/could you make this change in the jdk10/hs repository since it's > primarily hotspot files??? I can't tell from the webrev. I suggested hs. David > Thanks, > Coleen > > > > On 10/18/17 4:53 AM, Magnus Ihse Bursie wrote: >> The file jni.h is stored twice in the repo, both in hotspot and in >> java.base. They are both identical, and only the java.base version is >> included in the final product. >> >> This bug is a part of the umbrella effort JDK-8167078 "Duplicate >> header files in hotspot and jdk". As for JDK-8189607, my reasoning is >> that the java.base version is the one to keep. (In this case, there >> was actually a small difference between the two files -- the hotspot >> version first copyright year was 1997, but the java.base version was >> 1996. It makes sense to keep the oldest one.) >> >> My assumption was that hotspot include files should be sorted >> according to the containing directory, and since jni.h no longer >> resides in "prims", I've rearranged the include line where needed. >> >> The -I path added in CompileJvm.gmk is identical to the one in >> JDK-8189607, and will be merged to the same change (depending on which >> fix enters first.) >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8189608 >> WebRev: >> http://cr.openjdk.java.net/~ihse/JDK-8189608-remove-duplicated-jni/webrev.01 >> >> >> /Magnus > From OGATAK at jp.ibm.com Thu Oct 19 06:43:19 2017 From: OGATAK at jp.ibm.com (Kazunori Ogata) Date: Thu, 19 Oct 2017 15:43:19 +0900 Subject: 8188131: [PPC] Increase inlining thresholds to the same as other platforms In-Reply-To: References: Message-ID: Hi Martin, Thank you for your comment. I checked the code cache size by running SPECjbb2015 (composite mode, i.e., single JVM mode, heap size is 31GB). The used code cache size was increased by 4.5MB from 41982Kb to 47006Kb (+12%). Is the increase too large? The raw output of -XX:+PrintCodeCache are: === Original === CodeHeap 'non-profiled nmethods': size=652480Kb used=13884Kb max_used=13884Kb free=638595Kb bounds [0x00001000356f0000, 0x0000100036480000, 0x000010005d420000] CodeHeap 'profiled nmethods': size=652480Kb used=26593Kb max_used=26593Kb free=625886Kb bounds [0x000010000d9c0000, 0x000010000f3c0000, 0x00001000356f0000] CodeHeap 'non-nmethods': size=5760Kb used=1505Kb max_used=1559Kb free=4254Kb bounds [0x000010000d420000, 0x000010000d620000, 0x000010000d9c0000] total_blobs=16606 nmethods=10265 adapters=653 compilation: enabled === Modified (webrev.00) === CodeHeap 'non-profiled nmethods': size=652480Kb used=18516Kb max_used=18516Kb free=633964Kb bounds [0x0000100035730000, 0x0000100036950000, 0x000010005d460000] CodeHeap 'profiled nmethods': size=652480Kb used=26963Kb max_used=26963Kb free=625516Kb bounds [0x000010000da00000, 0x000010000f460000, 0x0000100035730000] CodeHeap 'non-nmethods': size=5760Kb used=1527Kb max_used=1565Kb free=4232Kb bounds [0x000010000d460000, 0x000010000d660000, 0x000010000da00000] total_blobs=16561 nmethods=10295 adapters=653 compilation: enabled Regards, Ogata From: "Doerr, Martin" To: Kazunori Ogata , "hotspot-dev at openjdk.java.net" , "ppc-aix-port-dev at openjdk.java.net" Date: 2017/10/18 19:43 Subject: RE: 8188131: [PPC] Increase inlining thresholds to the same as other platforms Hi Ogata, sorry for the delay. I had missed this one. The change looks feasible to me. It may only impact the utilization of the Code Cache. Can you evaluate that (e.g. 
by running large benchmarks with -XX:+PrintCodeCache)? Thanks and best regards, Martin -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Kazunori Ogata Sent: Freitag, 29. September 2017 08:42 To: hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net Subject: RFR: 8188131: [PPC] Increase inlining thresholds to the same as other platforms Hi all, Please review a change for JDK-8188131. Bug report: https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.openjdk.java.net_browse_JDK-2D8188131&d=DwIFAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p-FJcrbNvnCOLkbIdmQ2tigCrcpdU77tlI2EIdaEcJw&m=ExKSiZAany_n7vS453MD73lAZxkNhGsrlDkk-YUYORQ&s=ic27Fb2_vyTSsUAPraEI89UDJy9cbodGojvMw9DNHiU&e= Webrev: https://urldefense.proofpoint.com/v2/url?u=http-3A__cr.openjdk.java.net_-7Ehorii_8188131_webrev.00_&d=DwIFAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p-FJcrbNvnCOLkbIdmQ2tigCrcpdU77tlI2EIdaEcJw&m=ExKSiZAany_n7vS453MD73lAZxkNhGsrlDkk-YUYORQ&s=xS8PbLyuVtbOBRDMIB-i9r6lTggpGH3Np8kmONkkMAg&e= This change increases the default values of FreqInlineSize and InlineSmallCode in ppc64 to 325 and 2500, respectively. These values are the same as aarch64. The performance of TPC-DS Q96 was improved by about 6% with this change. Regards, Ogata From magnus.ihse.bursie at oracle.com Thu Oct 19 07:21:15 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 19 Oct 2017 09:21:15 +0200 Subject: RFR: JDK-8189608 Remove duplicated jni.h In-Reply-To: <237f7a02-73a8-a121-d4f4-5978c7479b79@oracle.com> References: <05451b2c-9905-e1cc-7cfb-39fbe1d1c983@oracle.com> <237f7a02-73a8-a121-d4f4-5978c7479b79@oracle.com> Message-ID: <075ad93f-cd7f-0147-e7a0-2c9ebc44acfd@oracle.com> On 2017-10-18 15:14, coleen.phillimore at oracle.com wrote: > > This looks great. Thank you! > There's also jvm.h too, which is a little more different but shouldn't > be. That needs a bit of work to make sure no relevant differences get lost. I opened JDK-8189610 for the hotspot team to fix this, before I can proceed with the unification. > > Did/could you make this change in the jdk10/hs repository since it's > primarily hotspot files??? I can't tell from the webrev. Sorry I was not clear on this. I started out by doing the patch in my jdk10/master clone, but I pushed it to jdk10/hs. /Magnus > > Thanks, > Coleen > > > > On 10/18/17 4:53 AM, Magnus Ihse Bursie wrote: >> The file jni.h is stored twice in the repo, both in hotspot and in >> java.base. They are both identical, and only the java.base version is >> included in the final product. >> >> This bug is a part of the umbrella effort JDK-8167078 "Duplicate >> header files in hotspot and jdk". As for JDK-8189607, my reasoning is >> that the java.base version is the one to keep. (In this case, there >> was actually a small difference between the two files -- the hotspot >> version first copyright year was 1997, but the java.base version was >> 1996. It makes sense to keep the oldest one.) >> >> My assumption was that hotspot include files should be sorted >> according to the containing directory, and since jni.h no longer >> resides in "prims", I've rearranged the include line where needed. >> >> The -I path added in CompileJvm.gmk is identical to the one in >> JDK-8189607, and will be merged to the same change (depending on >> which fix enters first.) 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8189608 >> WebRev: >> http://cr.openjdk.java.net/~ihse/JDK-8189608-remove-duplicated-jni/webrev.01 >> >> /Magnus > From goetz.lindenmaier at sap.com Thu Oct 19 11:03:16 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 19 Oct 2017 11:03:16 +0000 Subject: 8188131: [PPC] Increase inlining thresholds to the same as other platforms In-Reply-To: References: Message-ID: Hi Kazunori, To me, this seems to be a very large increase. Considering that not only the required code cache size but also the compiler cpu time will increase in this magnitude, this seems to be a rather risky step that should be tested for its benefits on systems that are highly contended. In this case, you probably had enough space in the code cache so that no recompilation etc. happened. To further look at this I could think of 1. finding the minimal code cache size with the old flags where the JIT is not disabled 2. finding the same size for the new flag settings --> How much more is needed for the new settings? Then you should compare the performance with the bigger code cache size for both, and see whether there still is performance improvement, or whether it's eaten up by more compile time. I.e. you should have a setup where compiler threads and application threads compete for the available CPUs. What do you think? Best regards, Goetz. > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf Of Kazunori Ogata > Sent: Donnerstag, 19. Oktober 2017 08:43 > To: Doerr, Martin > Cc: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net > Subject: RE: 8188131: [PPC] Increase inlining thresholds to the same as other > platforms > > Hi Martin, > > Thank you for your comment. I checked the code cache size by running > SPECjbb2015 (composite mode, i.e., single JVM mode, heap size is 31GB). > > The used code cache size was increased by 4.5MB from 41982Kb to 47006Kb > (+12%). Is the increase too large? 
> > > The raw output of -XX:+PrintCodeCache are: > > === Original === > CodeHeap 'non-profiled nmethods': size=652480Kb used=13884Kb > max_used=13884Kb free=638595Kb > bounds [0x00001000356f0000, 0x0000100036480000, 0x000010005d420000] > CodeHeap 'profiled nmethods': size=652480Kb used=26593Kb > max_used=26593Kb > free=625886Kb > bounds [0x000010000d9c0000, 0x000010000f3c0000, 0x00001000356f0000] > CodeHeap 'non-nmethods': size=5760Kb used=1505Kb max_used=1559Kb > free=4254Kb > bounds [0x000010000d420000, 0x000010000d620000, 0x000010000d9c0000] > total_blobs=16606 nmethods=10265 adapters=653 > compilation: enabled > > > === Modified (webrev.00) === > CodeHeap 'non-profiled nmethods': size=652480Kb used=18516Kb > max_used=18516Kb free=633964Kb > bounds [0x0000100035730000, 0x0000100036950000, 0x000010005d460000] > CodeHeap 'profiled nmethods': size=652480Kb used=26963Kb > max_used=26963Kb > free=625516Kb > bounds [0x000010000da00000, 0x000010000f460000, 0x0000100035730000] > CodeHeap 'non-nmethods': size=5760Kb used=1527Kb max_used=1565Kb > free=4232Kb > bounds [0x000010000d460000, 0x000010000d660000, 0x000010000da00000] > total_blobs=16561 nmethods=10295 adapters=653 > compilation: enabled > > > Regards, > Ogata > > > > > From: "Doerr, Martin" > To: Kazunori Ogata , "hotspot- > dev at openjdk.java.net" > , "ppc-aix-port-dev at openjdk.java.net" > > Date: 2017/10/18 19:43 > Subject: RE: 8188131: [PPC] Increase inlining thresholds to the > same as other platforms > > > > Hi Ogata, > > sorry for the delay. I had missed this one. > > The change looks feasible to me. > > It may only impact the utilization of the Code Cache. Can you evaluate > that (e.g. by running large benchmarks with -XX:+PrintCodeCache)? > > Thanks and best regards, > Martin > > > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf > Of Kazunori Ogata > Sent: Freitag, 29. September 2017 08:42 > To: hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net > Subject: RFR: 8188131: [PPC] Increase inlining thresholds to the same as > other platforms > > Hi all, > > Please review a change for JDK-8188131. > > Bug report: > https://urldefense.proofpoint.com/v2/url?u=https- > 3A__bugs.openjdk.java.net_browse_JDK- > 2D8188131&d=DwIFAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p- > FJcrbNvnCOLkbIdmQ2tigCrcpdU77tlI2EIdaEcJw&m=ExKSiZAany_n7vS453MD > 73lAZxkNhGsrlDkk- > YUYORQ&s=ic27Fb2_vyTSsUAPraEI89UDJy9cbodGojvMw9DNHiU&e= > > Webrev: > https://urldefense.proofpoint.com/v2/url?u=http- > 3A__cr.openjdk.java.net_- > 7Ehorii_8188131_webrev.00_&d=DwIFAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p- > FJcrbNvnCOLkbIdmQ2tigCrcpdU77tlI2EIdaEcJw&m=ExKSiZAany_n7vS453MD > 73lAZxkNhGsrlDkk-YUYORQ&s=xS8PbLyuVtbOBRDMIB- > i9r6lTggpGH3Np8kmONkkMAg&e= > > > This change increases the default values of FreqInlineSize and > InlineSmallCode in ppc64 to 325 and 2500, respectively. These values are > the same as aarch64. The performance of TPC-DS Q96 was improved by > about > 6% with this change. > > > Regards, > Ogata > > > From robbin.ehn at oracle.com Thu Oct 19 12:36:34 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 19 Oct 2017 14:36:34 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: <8e9d6d66-8d5c-6605-f0b1-fdbfedef43cf@oracle.com> Thanks for looking at this Coleen, On 2017-10-18 22:44, coleen.phillimore at oracle.com wrote: > > This looks really nice.? A few minor comments. 
> > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/webrev/src/hotspot/share/runtime/handshake.hpp.html
> >
> > Line 51: // or the JavaThread it self.
> >
> > typo, "itself"

Fixed

> > Thank you for adding these comments. I think they're just right in length and detail in the header.
> >
> > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/webrev/src/hotspot/share/runtime/handshake.cpp.html
> >
> > The protocol in HandshakeState::process_self_inner and cancel_inner is:
> >
> >     clear_handshake(thread);
> >     if (op != NULL) {
> >       op->do_handshake(thread);
> >     }
> >
> > But in HandshakeState::process_by_vmthread(), the order is reversed. Can you explain why in the comments?
> >
> >     _operation->do_handshake(target);
> >     clear_handshake(target);
> >
> > It looks like the thread can't continue while the handshake operation is in progress, so does the order matter?

The key part here is that the operation must be cleared before signaling the semaphore. The early clearing is there because if the thread is doing its own operation, the VM thread can quickly skip that thread by checking whether it still has an operation.

> > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/webrev/test/hotspot/jtreg/runtime/handshake/HandshakeWalkStackNativeTest.java.html
> >
> > This has the wrong @test name. These could use an @comment line about what you expect also. I don't know what's "Native" about it though, isn't it testing what happens when you use -XX:+ThreadLocalHandshakes?
> >
> > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/webrev/test/hotspot/jtreg/runtime/handshake/HandshakeWalkStackFallbackTest.java.html
> >
> > For this one too, an @comment noting that it's testing the fallback VM operation would be good.
> >
> > I don't need to see another webrev for the comment changes.

Here it is; there were inconsistencies in the tests, I think it is better now.
http://cr.openjdk.java.net/~rehn/8185640/v2/Coleen-n-Test-Cleanup-4/webrev/

> > Lastly, as I said before, I think putting the safepoint polls in the interpreter at return and backward branches would be a good follow-on changeset.

I will let Claes R decide if that is an acceptable approach.

Thanks, Robbin

> > Thanks,
> > Coleen
> >
> > On 10/11/17 9:37 AM, Robbin Ehn wrote:
>> Hi all,
>>
>> Starting the review of the code while JEP work is still not completed.
>>
>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640
>>
>> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not just all threads or none.
>>
>> Entire changeset:
>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/
>>
>> Divided into 3-parts,
>> SafepointMechanism abstraction:
>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/
>> Consolidating polling page allocation:
>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/
>> Handshakes:
>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/
>>
>> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be performed on all threads as soon as possible and they will continue to execute as soon as its own operation is completed. If a JavaThread is known to be running, then a handshake can be performed with that single JavaThread as well.
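As an illustration of the callback model just described, a handshake operation can be pictured as a closure that is run once per JavaThread; everything below is a made-up, self-contained sketch, not the API or types from the webrev:

#include <cstdio>
#include <vector>

// Stand-in for a JavaThread; only what the sketch needs.
struct Thread { int id; };

// A handshake operation: a callback applied to one thread at a time
// while that thread is in a safepoint-safe state (only simulated here).
struct ThreadClosureSketch {
  virtual void do_thread(Thread* t) = 0;
  virtual ~ThreadClosureSketch() {}
};

struct PrintStackSketch : ThreadClosureSketch {
  void do_thread(Thread* t) override {
    std::printf("would walk the stack of thread %d\n", t->id);
  }
};

// Unlike a global safepoint, each thread is released as soon as its own
// callback has run; no thread waits for the whole set to finish.
void execute_handshake(ThreadClosureSketch& op, std::vector<Thread>& threads) {
  for (Thread& t : threads) {
    op.do_thread(&t);   // run by the thread itself or by the VM thread
  }
}

int main() {
  std::vector<Thread> threads = { {1}, {2}, {3} };
  PrintStackSketch op;
  execute_handshake(op, threads);
  return 0;
}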
>> >> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >> >> Example of potential use-cases: >> -Biased lock revocation >> -External requests for stack traces >> -Deoptimization >> -Async exception delivery >> -External suspension >> -Eliding memory barriers >> >> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported platforms are Linux x64 and Solaris SPARC. >> >> Tested heavily with various test suits and comes with a few new tests. >> >> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all JavaThreads in an array instead of a linked list. >> >> Thanks, Robbin > From robbin.ehn at oracle.com Thu Oct 19 12:40:24 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 19 Oct 2017 14:40:24 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <82848a04-21dd-119e-3d53-101a7f25cb54@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <82848a04-21dd-119e-3d53-101a7f25cb54@oracle.com> Message-ID: <04bd05d6-7ce2-93a9-288f-a640fb4b2806@oracle.com> Here is the third incremental change: http://cr.openjdk.java.net/~rehn/8185640/v2/Coleen-n-Test-Cleanup-4/webrev/ Goes on top of Atomic-Update-Rebase-3. Let me know if anyone want to see some other kind of webrevs. Thanks, Robbin On 2017-10-18 11:15, Robbin Ehn wrote: > Hi all, > > Update after re-base with new atomic implementation: > http://cr.openjdk.java.net/~rehn/8185640/v1/Atomic-Update-Rebase-3/ > This goes on top of the Handshakes-2. > > Let me know if you want some other kinds of webrevs. > > I would like to point out that Mikael Gerdin and Erik ?sterlund also are contributors of this changeset. > > Thanks, Robbin > > On 2017-10-11 15:37, Robbin Ehn wrote: >> Hi all, >> >> Starting the review of the code while JEP work is still not completed. >> >> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >> >> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not just all threads or none. 
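A deliberately simplified sketch of the per-thread indirection described above; the names are invented, and the real implementation arms an actual protected page (so the thread traps at its poll) rather than testing a flag word:

#include <cstdint>

// Simplified model of the per-thread polling indirection.
struct JavaThreadSketch {
  // Where this thread's poll reads from.
  volatile uintptr_t* poll_word;
};

static uintptr_t safe_word  = 0;   // readable: the poll falls through
static uintptr_t armed_word = 1;   // stands in for the guarded page

// Called by the VM thread: only this one thread notices at its next poll.
void arm_thread(JavaThreadSketch* t)    { t->poll_word = &armed_word; }
void disarm_thread(JavaThreadSketch* t) { t->poll_word = &safe_word; }

// What a poll site conceptually checks in this model.
bool poll_says_stop(JavaThreadSketch* t) { return *t->poll_word != 0; }

Arming every thread this way gives back the global safepoint; arming a single thread is what makes a one-thread handshake cheap.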
>> >> Entire changeset: >> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >> >> Divided into 3-parts, >> SafepointMechanism abstraction: >> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >> Consolidating polling page allocation: >> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >> Handshakes: >> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >> >> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a handshake can be performed with that single JavaThread as well. >> >> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >> >> Example of potential use-cases: >> -Biased lock revocation >> -External requests for stack traces >> -Deoptimization >> -Async exception delivery >> -External suspension >> -Eliding memory barriers >> >> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported platforms are Linux x64 and Solaris SPARC. >> >> Tested heavily with various test suits and comes with a few new tests. >> >> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all JavaThreads in an array instead of a linked list. >> >> Thanks, Robbin From coleen.phillimore at oracle.com Thu Oct 19 13:56:17 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 19 Oct 2017 09:56:17 -0400 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <04bd05d6-7ce2-93a9-288f-a640fb4b2806@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <82848a04-21dd-119e-3d53-101a7f25cb54@oracle.com> <04bd05d6-7ce2-93a9-288f-a640fb4b2806@oracle.com> Message-ID: <868cd47e-f120-69d9-8932-45501794d4b5@oracle.com> http://cr.openjdk.java.net/~rehn/8185640/v2/Coleen-n-Test-Cleanup-4/webrev/test/hotspot/jtreg/runtime/handshake/HandshakeTransitionTest.java.udiff.html Thank you this is better. In this test, what happens if it fails? Everything looks better with this change. 
Thanks, Coleen On 10/19/17 8:40 AM, Robbin Ehn wrote: > Here is the third incremental change: > http://cr.openjdk.java.net/~rehn/8185640/v2/Coleen-n-Test-Cleanup-4/webrev/ > > Goes on top of Atomic-Update-Rebase-3. > > Let me know if anyone want to see some other kind of webrevs. > > Thanks, Robbin > > On 2017-10-18 11:15, Robbin Ehn wrote: >> Hi all, >> >> Update after re-base with new atomic implementation: >> http://cr.openjdk.java.net/~rehn/8185640/v1/Atomic-Update-Rebase-3/ >> This goes on top of the Handshakes-2. >> >> Let me know if you want some other kinds of webrevs. >> >> I would like to point out that Mikael Gerdin and Erik ?sterlund also >> are contributors of this changeset. >> >> Thanks, Robbin >> >> On 2017-10-11 15:37, Robbin Ehn wrote: >>> Hi all, >>> >>> Starting the review of the code while JEP work is still not completed. >>> >>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >>> >>> This JEP introduces a way to execute a callback on threads without >>> performing a global VM safepoint. It makes it both possible and >>> cheap to stop individual threads and not just all threads or none. >>> >>> Entire changeset: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >>> >>> Divided into 3-parts, >>> SafepointMechanism abstraction: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >>> Consolidating polling page allocation: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >>> Handshakes: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >>> >>> A handshake operation is a callback that is executed for each >>> JavaThread while that thread is in a safepoint safe state. The >>> callback is executed either by the thread itself or by the VM thread >>> while keeping the thread in a blocked state. The big difference >>> between safepointing and handshaking is that the per thread >>> operation will be performed on all threads as soon as possible and >>> they will continue to execute as soon as it?s own operation is >>> completed. If a JavaThread is known to be running, then a handshake >>> can be performed with that single JavaThread as well. >>> >>> The current safepointing scheme is modified to perform an >>> indirection through a per-thread pointer which will allow a single >>> thread's execution to be forced to trap on the guard page. In order >>> to force a thread to yield the VM updates the per-thread pointer for >>> the corresponding thread to point to the guarded page. >>> >>> Example of potential use-cases: >>> -Biased lock revocation >>> -External requests for stack traces >>> -Deoptimization >>> -Async exception delivery >>> -External suspension >>> -Eliding memory barriers >>> >>> All of these will benefit the VM moving towards becoming more >>> low-latency friendly by reducing the number of global safepoints. >>> Platforms that do not yet implement the per JavaThread poll, a >>> fallback to normal safepoint is in place. HandshakeOneThread will >>> then be a normal safepoint. The supported platforms are Linux x64 >>> and Solaris SPARC. >>> >>> Tested heavily with various test suits and comes with a few new tests. >>> >>> Performance testing using standardized benchmark show no >>> signification changes, the latest number was -0.7% on Linux x64 and >>> +1.5% Solaris SPARC (not statistically ensured). A minor regression >>> for the load vs load load on x64 is expected and a slight increase >>> on SPARC due to the cost of ?materializing? the page vs load load. 
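As a reading aid for the "load vs load load" remark above, a schematic of the two poll shapes being compared, written as plain C++ rather than the emitted assembly (illustrative only):

#include <cstdint>

// Global poll: one access to a fixed, process-wide polling page.
bool global_poll(volatile uintptr_t* global_polling_page) {
  return *global_polling_page != 0;                 // single load
}

// Thread-local poll: first load the per-thread pointer, then touch it.
bool thread_local_poll(volatile uintptr_t* const* per_thread_poll_slot) {
  volatile uintptr_t* p = *per_thread_poll_slot;    // load
  return *p != 0;                                   // dependent load ("load load")
}

The extra dependent load per poll site is the small x64 regression being referred to.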
>>> The time to trigger a safepoint was measured on a large machine to >>> not be an issue. The looping over threads and arming the polling >>> page will benefit from the work on JavaThread life-cycle (8167108 - >>> SMR and JavaThread Lifecycle: >>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) >>> which puts all JavaThreads in an array instead of a linked list. >>> >>> Thanks, Robbin From coleen.phillimore at oracle.com Thu Oct 19 14:20:34 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 19 Oct 2017 10:20:34 -0400 Subject: RFR: 8184914: Use MacroAssembler::cmpoop() consistently when comparing heap objects In-Reply-To: <55bb0f72-df71-44bc-53a0-7d982ab1ca04@redhat.com> References: <8d667010-f17e-7d1b-088b-106999e3b005@redhat.com> <9b629556-b3f0-e52e-35e0-711c6a767e95@oracle.com> <55bb0f72-df71-44bc-53a0-7d982ab1ca04@redhat.com> Message-ID: I'm calling this as "trivial" and can be pushed now. Thanks, Coleen On 10/17/17 5:05 PM, Roman Kennke wrote: > >> >> This looks reasonable to me.? Maybe the compiler group should review >> the c1 part.? I changed the mailing list to hotspot-dev. >> I can sponsor this for you. > Thanks, thanks and thanks! ;-) > > Roman > >> Thanks, >> Coleen >> >> On 10/17/17 4:22 PM, Roman Kennke wrote: >>> (Not sure if this is the correct list to ask.. if not, please let me >>> know and/or redirect me) >>> >>> Currently, cmpoop() is only declared for 32-bit x86, and only used >>> in 2 places in C1 to compare oops. In other places, oops are >>> compared using cmpptr(). It would be useful to distinguish normal >>> pointer comparisons from heap object comparisons, and use cmpoop() >>> consistently for heap object comparisons. This would remove clutter >>> in several places where we have #ifdef _LP64 around comparisons, and >>> would also allow to insert necessary barriers for GCs that need them >>> (e.g. Shenandoah) later. >>> >>> http://cr.openjdk.java.net/~rkennke/8184914/webrev.00/ >>> >>> >>> Tested by running hotspot_gc jtreg tests. >>> >>> Can I get a review please? >>> >>> Thanks, Roman >>> >>> >> > From OGATAK at jp.ibm.com Fri Oct 20 06:31:47 2017 From: OGATAK at jp.ibm.com (Kazunori Ogata) Date: Fri, 20 Oct 2017 15:31:47 +0900 Subject: 8188131: [PPC] Increase inlining thresholds to the same as other platforms In-Reply-To: References: Message-ID: Hi Goetz, Thank you for your comment. OK, I'll evaluate the patch more by comparing the minimum code cache sizes and the performance on the cache size. It is helpful if you could explain what is the difference of the JIT behavior when the code cache is large enough and when it is the minimum size. It seems almost the same to me because all the methods that needed to be compiled should be compiled in both cases, but I may miss something. By the way, the benchmark I confirmed performance improvement was TPC-DS q96, but I measured the code cache size of SPECjbb2015 by my mistake. I'll compare the minimum code cache sizes and the performance of both benchmarks, as this patch will affect all applications. Regards, Ogata From: "Lindenmaier, Goetz" To: Kazunori Ogata , "Doerr, Martin" Cc: "ppc-aix-port-dev at openjdk.java.net" , "hotspot-dev at openjdk.java.net" Date: 2017/10/19 20:03 Subject: RE: 8188131: [PPC] Increase inlining thresholds to the same as other platforms Hi Kazunori, To me, this seems to be a very large increase. 
Considering that not only the required code cache size but also the compiler cpu time will increase in this magnitude, this seems to be a rather risky step that should be tested for its benefits on systems that are highly contended. In this case, you probably had enough space in the code cache so that no recompilation etc. happened. To further look at this I could think of 1. finding the minimal code cache size with the old flags where the JIT is not disabled 2. finding the same size for the new flag settings --> How much more is needed for the new settings? Then you should compare the performance with the bigger code cache size for both, and see whether there still is performance improvement, or whether it's eaten up by more compile time. I.e. you should have a setup where compiler threads and application threads compete for the available CPUs. What do you think? Best regards, Goetz. > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf Of Kazunori Ogata > Sent: Donnerstag, 19. Oktober 2017 08:43 > To: Doerr, Martin > Cc: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net > Subject: RE: 8188131: [PPC] Increase inlining thresholds to the same as other > platforms > > Hi Martin, > > Thank you for your comment. I checked the code cache size by running > SPECjbb2015 (composite mode, i.e., single JVM mode, heap size is 31GB). > > The used code cache size was increased by 4.5MB from 41982Kb to 47006Kb > (+12%). Is the increase too large? > > > The raw output of -XX:+PrintCodeCache are: > > === Original === > CodeHeap 'non-profiled nmethods': size=652480Kb used=13884Kb > max_used=13884Kb free=638595Kb > bounds [0x00001000356f0000, 0x0000100036480000, 0x000010005d420000] > CodeHeap 'profiled nmethods': size=652480Kb used=26593Kb > max_used=26593Kb > free=625886Kb > bounds [0x000010000d9c0000, 0x000010000f3c0000, 0x00001000356f0000] > CodeHeap 'non-nmethods': size=5760Kb used=1505Kb max_used=1559Kb > free=4254Kb > bounds [0x000010000d420000, 0x000010000d620000, 0x000010000d9c0000] > total_blobs=16606 nmethods=10265 adapters=653 > compilation: enabled > > > === Modified (webrev.00) === > CodeHeap 'non-profiled nmethods': size=652480Kb used=18516Kb > max_used=18516Kb free=633964Kb > bounds [0x0000100035730000, 0x0000100036950000, 0x000010005d460000] > CodeHeap 'profiled nmethods': size=652480Kb used=26963Kb > max_used=26963Kb > free=625516Kb > bounds [0x000010000da00000, 0x000010000f460000, 0x0000100035730000] > CodeHeap 'non-nmethods': size=5760Kb used=1527Kb max_used=1565Kb > free=4232Kb > bounds [0x000010000d460000, 0x000010000d660000, 0x000010000da00000] > total_blobs=16561 nmethods=10295 adapters=653 > compilation: enabled > > > Regards, > Ogata > > > > > From: "Doerr, Martin" > To: Kazunori Ogata , "hotspot- > dev at openjdk.java.net" > , "ppc-aix-port-dev at openjdk.java.net" > > Date: 2017/10/18 19:43 > Subject: RE: 8188131: [PPC] Increase inlining thresholds to the > same as other platforms > > > > Hi Ogata, > > sorry for the delay. I had missed this one. > > The change looks feasible to me. > > It may only impact the utilization of the Code Cache. Can you evaluate > that (e.g. by running large benchmarks with -XX:+PrintCodeCache)? > > Thanks and best regards, > Martin > > > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf > Of Kazunori Ogata > Sent: Freitag, 29. 
September 2017 08:42 > To: hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net > Subject: RFR: 8188131: [PPC] Increase inlining thresholds to the same as > other platforms > > Hi all, > > Please review a change for JDK-8188131. > > Bug report: > https://urldefense.proofpoint.com/v2/url?u=https- > 3A__bugs.openjdk.java.net_browse_JDK- > 2D8188131&d=DwIFAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p- > FJcrbNvnCOLkbIdmQ2tigCrcpdU77tlI2EIdaEcJw&m=ExKSiZAany_n7vS453MD > 73lAZxkNhGsrlDkk- > YUYORQ&s=ic27Fb2_vyTSsUAPraEI89UDJy9cbodGojvMw9DNHiU&e= > > Webrev: > https://urldefense.proofpoint.com/v2/url?u=http- > 3A__cr.openjdk.java.net_- > 7Ehorii_8188131_webrev.00_&d=DwIFAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p- > FJcrbNvnCOLkbIdmQ2tigCrcpdU77tlI2EIdaEcJw&m=ExKSiZAany_n7vS453MD > 73lAZxkNhGsrlDkk-YUYORQ&s=xS8PbLyuVtbOBRDMIB- > i9r6lTggpGH3Np8kmONkkMAg&e= > > > This change increases the default values of FreqInlineSize and > InlineSmallCode in ppc64 to 325 and 2500, respectively. These values are > the same as aarch64. The performance of TPC-DS Q96 was improved by > about > 6% with this change. > > > Regards, > Ogata > > > From bourges.laurent at gmail.com Fri Oct 20 08:19:53 2017 From: bourges.laurent at gmail.com (=?UTF-8?Q?Laurent_Bourg=C3=A8s?=) Date: Fri, 20 Oct 2017 10:19:53 +0200 Subject: Upgrading gcc arch ? In-Reply-To: References: Message-ID: Hi, I wonder if it is time to compile c/c++ code with a more recent cpu architecture (x86-64 is quite old: only SSE ?) to take benefit of performance optimizations offered by recent CPU and compilers (AVX...). Of course that means such builds would be specific to a CPU class and that will require build changes to make multiple flavors depending on the CPU classes ... See gcc -mtune argument: https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html " ?sandybridge? Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set support. ?ivybridge? Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C instruction set support. ?haswell? Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2 and F16C instruction set support. ?broadwell? Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set support. ?skylake? Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and XSAVES instruction set support. ?bonnell? Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support. ?silvermont? Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction set support. ?knl? Intel Knight's Landing CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F, AVX512PF, AVX512ER and AVX512CD instruction set support. ?skylake-avx512? 
Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction set support. " Comments are welcome, Laurent From glaubitz at physik.fu-berlin.de Fri Oct 20 08:25:54 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Fri, 20 Oct 2017 10:25:54 +0200 Subject: Upgrading gcc arch ? In-Reply-To: References: Message-ID: <8d81eea9-0fff-5981-f885-acc66c69fb33@physik.fu-berlin.de> On 10/20/2017 10:19 AM, Laurent Bourg?s wrote: > I wonder if it is time to compile c/c++ code with a more recent cpu > architecture (x86-64 is quite old: only SSE ?) to take benefit of > performance optimizations offered by recent CPU and compilers (AVX...). Only if it's possible to make use of these features during runtime as it's being done on SPARC. > Of course that means such builds would be specific to a CPU class and that > will require build changes to make multiple flavors depending on the CPU > classes ... No, if this a compile time option, this is an absolute no go. It would be absolutely crazy to break compatibility with such widely available hardware with a piece of software which has one of the largest installation bases world wide. Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. `' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From david.holmes at oracle.com Fri Oct 20 09:19:00 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 20 Oct 2017 19:19:00 +1000 Subject: Upgrading gcc arch ? In-Reply-To: References: Message-ID: bcc'ing the discuss list On 20/10/2017 6:19 PM, Laurent Bourg?s wrote: > Hi, > > I wonder if it is time to compile c/c++ code with a more recent cpu > architecture (x86-64 is quite old: only SSE ?) to take benefit of > performance optimizations offered by recent CPU and compilers (AVX...). The focus in hotspot is on JIT generated code which does take advantage of such optimizations based on the runtime CPU capabilities. Is there specific C code in the JDK that you think would benefit from them? Have you done comparison builds and run any benchmarks? Thanks, David > Of course that means such builds would be specific to a CPU class and that > will require build changes to make multiple flavors depending on the CPU > classes ... > > See gcc -mtune argument: > https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html > > " > ?sandybridge? > Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set support. > ?ivybridge? > Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C > instruction set support. > ?haswell? > Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, > FMA, BMI, BMI2 and F16C instruction set support. > ?broadwell? > Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, > SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, > RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set > support. > ?skylake? 
> Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, > FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and > XSAVES instruction set support. > ?bonnell? > Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3 > and SSSE3 instruction set support. > ?silvermont? > Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, > SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction set > support. > ?knl? > Intel Knight's Landing CPU with 64-bit extensions, MOVBE, MMX, SSE, > SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, > FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F, > AVX512PF, AVX512ER and AVX512CD instruction set support. > ?skylake-avx512? > Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, > SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE, > RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, > XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction set > support. > > " > > Comments are welcome, > Laurent > From thomas.schatzl at oracle.com Fri Oct 20 10:05:29 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 20 Oct 2017 12:05:29 +0200 Subject: RFR(M) 8186834:Expanding old area without full GC in parallel GC In-Reply-To: References: Message-ID: <1508493929.2820.9.camel@oracle.com> Hi, On Fri, 2017-10-20 at 18:13 +0900, Michihiro Horie wrote: > Hi Thomas, > > Thanks a lot for the discussion, also sorry for my late reply. > > I think MinHeapFreeRatio tunes the size of heap expansion, while > UseAdaptiveGenerationSizePolicyBeforeMajorCollection decides to > expand heap, whose size is decided by MinHeapFreeRatio, without full > GC. I agree, but one could tune MinHeapFreeRatio so that the amount of full gcs and the time spent in there would be much smaller than by default. > >Particularly if, as you mention, full gc will not yield a > significant amount of freed memory, both methods seem to achieve the > exact same effect. > Yes, so I think heap once expands up to Xmx, both methods have the > same effect. > [...] > >Otherwise, if you were able to pass different VM arguments to the > >different VMs, the use of -Xms (instead of that new flag) would seem > >straightforward to me (Only specifying -Xms will not actually commit > >the memory, so there is no difference in actual memory use). > I did not tell this (sorry), but currently Xms and Xmx are set > explicitly in the VM arguments because we want to use only needed > memory. > As mentioned before, even with -Xms == -Xmx, memory is not actually backed with physical memory until actually touched by default. I could imagine that -Xms==-Xmx would yield better (initial) performance as the young gen will be sized larger. So the suggested change would only make a difference in case you also explicitly pre-touched that memory from what I understood. Not sure if that is what you do or desire (enabling memory pretouch is typically only used with -Xms==-Xmx, so not sure if that is a good use case). Thanks, Thomas From robbin.ehn at oracle.com Fri Oct 20 10:11:54 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 20 Oct 2017 12:11:54 +0200 Subject: Upgrading gcc arch ? 
In-Reply-To: References: Message-ID: <7bbf5a0d-ed69-6f6e-6c59-2373a465f65d@oracle.com> On 2017-10-20 11:19, David Holmes wrote: > bcc'ing the discuss list > > On 20/10/2017 6:19 PM, Laurent Bourg?s wrote: >> Hi, >> >> I wonder if it is time to compile c/c++ code with a more recent cpu >> architecture (x86-64 is quite old: only SSE ?) to take benefit of >> performance optimizations offered by recent CPU and compilers (AVX...). > > The focus in hotspot is on JIT generated code which does take advantage of such optimizations based on the runtime CPU capabilities. > > Is there specific C code in the JDK that you think would benefit from them? If there are specific code that preform much better with new some newer features we could utilize function multiversioning feature in the gcc. E.g.: __attribute__((target_clones("sse4.2","sse3","default"))) void stream_function(...) { Negative impact on size, so as David says, benchmark first. /Robbin > > Have you done comparison builds and run any benchmarks? > > Thanks, > David > >> Of course that means such builds would be specific to a CPU class and that >> will require build changes to make multiple flavors depending on the CPU >> classes ... >> >> See gcc -mtune argument: >> https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html >> >> " >> ?sandybridge? >> ???? Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, >> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set support. >> ?ivybridge? >> ???? Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, >> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C >> instruction set support. >> ?haswell? >> ???? Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, >> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, >> FMA, BMI, BMI2 and F16C instruction set support. >> ?broadwell? >> ???? Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, >> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, >> RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set >> support. >> ?skylake? >> ???? Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, >> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, >> FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and >> XSAVES instruction set support. >> ?bonnell? >> ???? Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3 >> and SSSE3 instruction set support. >> ?silvermont? >> ???? Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, >> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction set >> support. >> ?knl? >> ???? Intel Knight's Landing CPU with 64-bit extensions, MOVBE, MMX, SSE, >> SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, >> FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F, >> AVX512PF, AVX512ER and AVX512CD instruction set support. >> ?skylake-avx512? >> ???? Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, >> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE, >> RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, >> XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction set >> support. 
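Filling out the target_clones fragment quoted above into something compilable (the function body and data layout are invented purely for illustration; requires gcc 6+ with ifunc support, as on Linux):

#include <stddef.h>

/* gcc emits one clone per listed target plus a resolver that picks the
   best matching clone at load time via an ifunc. */
__attribute__((target_clones("sse4.2", "sse3", "default")))
double stream_sum(const double* a, size_t n) {
  double s = 0.0;
  for (size_t i = 0; i < n; i++) {
    s += a[i];   /* simple loop each clone can vectorize differently */
  }
  return s;
}

The command line stays a plain -O2 x86-64 build; only the attributed function is specialized, which keeps the size cost local to the hot routines that actually benefit.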
>> >> " >> >> Comments are welcome, >> Laurent >> From thomas.stuefe at gmail.com Fri Oct 20 10:56:48 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 20 Oct 2017 12:56:48 +0200 Subject: Upgrading gcc arch ? In-Reply-To: <7bbf5a0d-ed69-6f6e-6c59-2373a465f65d@oracle.com> References: <7bbf5a0d-ed69-6f6e-6c59-2373a465f65d@oracle.com> Message-ID: On Fri, Oct 20, 2017 at 12:11 PM, Robbin Ehn wrote: > On 2017-10-20 11:19, David Holmes wrote: > >> bcc'ing the discuss list >> >> On 20/10/2017 6:19 PM, Laurent Bourg?s wrote: >> >>> Hi, >>> >>> I wonder if it is time to compile c/c++ code with a more recent cpu >>> architecture (x86-64 is quite old: only SSE ?) to take benefit of >>> performance optimizations offered by recent CPU and compilers (AVX...). >>> >> >> The focus in hotspot is on JIT generated code which does take advantage >> of such optimizations based on the runtime CPU capabilities. >> >> Is there specific C code in the JDK that you think would benefit from >> them? >> > > If there are specific code that preform much better with new some newer > features we could utilize function multiversioning feature in the gcc. > E.g.: > __attribute__((target_clones("sse4.2","sse3","default"))) > void stream_function(...) { > ry > Negative impact on size, so as David says, benchmark first. > > But how would this help with gcc specific optimizations ? You can provide your own implementation, but I thought the idea was to let gcc do the optimization work via -mtune. We still would have one global mtune setting for the compilation unit, right? ..Thomas > /Robbin > > > >> Have you done comparison builds and run any benchmarks? >> >> Thanks, >> David >> >> Of course that means such builds would be specific to a CPU class and that >>> will require build changes to make multiple flavors depending on the CPU >>> classes ... >>> >>> See gcc -mtune argument: >>> https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html >>> >>> " >>> ?sandybridge? >>> Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, >>> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set >>> support. >>> ?ivybridge? >>> Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, >>> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C >>> instruction set support. >>> ?haswell? >>> Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, >>> SSE3, >>> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, >>> FMA, BMI, BMI2 and F16C instruction set support. >>> ?broadwell? >>> Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, >>> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, >>> RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set >>> support. >>> ?skylake? >>> Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, >>> SSE3, >>> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, >>> FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and >>> XSAVES instruction set support. >>> ?bonnell? >>> Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, >>> SSE3 >>> and SSSE3 instruction set support. >>> ?silvermont? >>> Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, >>> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction >>> set >>> support. >>> ?knl? 
>>> Intel Knight's Landing CPU with 64-bit extensions, MOVBE, MMX, SSE, >>> SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, >>> FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F, >>> AVX512PF, AVX512ER and AVX512CD instruction set support. >>> ?skylake-avx512? >>> Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, >>> SSE2, >>> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, >>> FSGSBASE, >>> RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, >>> XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction >>> set >>> support. >>> >>> " >>> >>> Comments are welcome, >>> Laurent >>> >>> From robbin.ehn at oracle.com Fri Oct 20 11:37:50 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 20 Oct 2017 13:37:50 +0200 Subject: Upgrading gcc arch ? In-Reply-To: References: <7bbf5a0d-ed69-6f6e-6c59-2373a465f65d@oracle.com> Message-ID: <9272af52-8d32-ff5d-49f2-098223096173@oracle.com> On 2017-10-20 12:56, Thomas St?fe wrote: > > > On Fri, Oct 20, 2017 at 12:11 PM, Robbin Ehn > wrote: > > On 2017-10-20 11:19, David Holmes wrote: > > bcc'ing the discuss list > > On 20/10/2017 6:19 PM, Laurent Bourg?s wrote: > > Hi, > > I wonder if it is time to compile c/c++ code with a more recent cpu > architecture (x86-64 is quite old: only SSE ?) to take benefit of > performance optimizations offered by recent CPU and compilers (AVX...). > > > The focus in hotspot is on JIT generated code which does take advantage of such optimizations based on the runtime CPU capabilities. > > Is there specific C code in the JDK that you think would benefit from them? > > > If there are specific code that preform much better with new some newer features we could utilize function multiversioning feature in the gcc. > E.g.: > __attribute__((target_clones("sse4.2","sse3","default"))) > void stream_function(...) { > ry > Negative impact on size, so as David says, benchmark first. > > > But how would this help with gcc specific optimizations ?? You can provide your own implementation, but I thought the idea was to let gcc do the optimization work via -mtune. We still would have one global mtune setting for the compilation unit, right? target_clones attribute "is used to specify that a function be cloned into multiple versions compiled with different target options than specified on the command line." gcc generates, in above, 3 functions, you can also do: __attribute__((target_clones("arch=znver1","arch=skylake", "default"))) So you get: [rehn at rehn-lt ~]$ nm a.out | grep stream_function 00000000004009e0 T stream_function 0000000000400bc0 t stream_function.arch_skylake.1 0000000000400b90 t stream_function.arch_znver1.0 0000000000400bf0 i stream_function.ifunc 0000000000400bf0 W stream_function.resolver /Robbin > > ..Thomas > > /Robbin > > > > Have you done comparison builds and run any benchmarks? > > Thanks, > David > > Of course that means such builds would be specific to a CPU class and that > will require build changes to make multiple flavors depending on the CPU > classes ... > > See gcc -mtune argument: > https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html > > " > ?sandybridge? > ???? Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set support. > ?ivybridge? > ???? 
Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C > instruction set support. > ?haswell? > ???? Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, > FMA, BMI, BMI2 and F16C instruction set support. > ?broadwell? > ???? Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, > SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, > RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set > support. > ?skylake? > ???? Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, > FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and > XSAVES instruction set support. > ?bonnell? > ???? Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3 > and SSSE3 instruction set support. > ?silvermont? > ???? Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, > SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction set > support. > ?knl? > ???? Intel Knight's Landing CPU with 64-bit extensions, MOVBE, MMX, SSE, > SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, > FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F, > AVX512PF, AVX512ER and AVX512CD instruction set support. > ?skylake-avx512? > ???? Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, > SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE, > RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, > XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction set > support. > > " > > Comments are welcome, > Laurent > > From karen.kinnear at oracle.com Fri Oct 20 16:24:17 2017 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 20 Oct 2017 12:24:17 -0400 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: <3018D48F-245A-4C92-9CED-5692BBD88E8C@oracle.com> Robbin, Erik, Mikael - Delighted to see this! Looks good. I don?t need to see any updates - these are minor comments. Thank you for the performance testing Couple of questions/comments: 1. platform support supports_thread_local_poll returns true for AMD64 or SPARC Your comment said Linux x64 and Sparc only. What about Mac and Windows? 2. safepointMechanism_inline.hpp - comment clarification line 42 - ?Mutexes can be taken but none JavaThread?. Are you saying: ?Non-JavaThreads do not support handshakes, but must stop for safepoints.? Not sure what the Mutex comment is about 3. globals.hpp The way I understand this - ThreadLocalHandshakes flag is not so much to enable use of ThreadLocalHandle operations, but to enable use of TLH for global safe point. If that is true, could you possibly at least clarify this in the comment if there is not a better name for the flag? 4. thank you for looking into startup performance and interpreter return/backward branch checks. 5. handshake.cpp Could you possibly add a comment that thread_has_completed and/or pool_for_completed_thread means that the thread has either done the operation or the operation has been cancelled? 
I get that we are polling this to tell when it is safe to return to the synchronous requestor not to determine if the thread actually performed the operation. The comment would make that clearer. thanks, Karen > On Oct 11, 2017, at 9:37 AM, Robbin Ehn wrote: > > Hi all, > > Starting the review of the code while JEP work is still not completed. > > JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 > > This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not just all threads or none. > > Entire changeset: > http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ > > Divided into 3-parts, > SafepointMechanism abstraction: > http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ > Consolidating polling page allocation: > http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ > Handshakes: > http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ > > A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a handshake can be performed with that single JavaThread as well. > > The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. > > Example of potential use-cases: > -Biased lock revocation > -External requests for stack traces > -Deoptimization > -Async exception delivery > -External suspension > -Eliding memory barriers > > All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. > Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported platforms are Linux x64 and Solaris SPARC. > > Tested heavily with various test suits and comes with a few new tests. > > Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. > The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all JavaThreads in an array instead of a linked list. > > Thanks, Robbin From thomas.stuefe at gmail.com Fri Oct 20 17:12:49 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 20 Oct 2017 19:12:49 +0200 Subject: Upgrading gcc arch ? 
In-Reply-To: <9272af52-8d32-ff5d-49f2-098223096173@oracle.com> References: <7bbf5a0d-ed69-6f6e-6c59-2373a465f65d@oracle.com> <9272af52-8d32-ff5d-49f2-098223096173@oracle.com> Message-ID: On Fri, Oct 20, 2017 at 1:37 PM, Robbin Ehn wrote: > On 2017-10-20 12:56, Thomas St?fe wrote: > >> >> >> On Fri, Oct 20, 2017 at 12:11 PM, Robbin Ehn > > wrote: >> >> On 2017-10-20 11:19, David Holmes wrote: >> >> bcc'ing the discuss list >> >> On 20/10/2017 6:19 PM, Laurent Bourg?s wrote: >> >> Hi, >> >> I wonder if it is time to compile c/c++ code with a more >> recent cpu >> architecture (x86-64 is quite old: only SSE ?) to take >> benefit of >> performance optimizations offered by recent CPU and compilers >> (AVX...). >> >> >> The focus in hotspot is on JIT generated code which does take >> advantage of such optimizations based on the runtime CPU capabilities. >> >> Is there specific C code in the JDK that you think would benefit >> from them? >> >> >> If there are specific code that preform much better with new some >> newer features we could utilize function multiversioning feature in the gcc. >> E.g.: >> __attribute__((target_clones("sse4.2","sse3","default"))) >> void stream_function(...) { >> ry >> Negative impact on size, so as David says, benchmark first. >> >> >> But how would this help with gcc specific optimizations ? You can >> provide your own implementation, but I thought the idea was to let gcc do >> the optimization work via -mtune. We still would have one global mtune >> setting for the compilation unit, right? >> > > target_clones attribute "is used to specify that a function be cloned into > multiple versions compiled with different target options than specified on > the command line." > gcc generates, in above, 3 functions, you can also do: > __attribute__((target_clones("arch=znver1","arch=skylake", "default"))) > > So you get: > [rehn at rehn-lt ~]$ nm a.out | grep stream_function > 00000000004009e0 T stream_function > 0000000000400bc0 t stream_function.arch_skylake.1 > 0000000000400b90 t stream_function.arch_znver1.0 > 0000000000400bf0 i stream_function.ifunc > 0000000000400bf0 W stream_function.resolver > > /Robbin > > Very interesting, thanks for the pointer. I did not know that was possible. Best Regards, Thomas > >> ..Thomas >> >> /Robbin >> >> >> >> Have you done comparison builds and run any benchmarks? >> >> Thanks, >> David >> >> Of course that means such builds would be specific to a CPU >> class and that >> will require build changes to make multiple flavors depending >> on the CPU >> classes ... >> >> See gcc -mtune argument: >> https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html >> >> >> >> " >> ?sandybridge? >> Intel Sandy Bridge CPU with 64-bit extensions, MMX, >> SSE, SSE2, SSE3, >> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL >> instruction set support. >> ?ivybridge? >> Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, >> SSE2, SSE3, >> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, >> RDRND and F16C >> instruction set support. >> ?haswell? >> Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, >> SSE, SSE2, SSE3, >> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, >> FSGSBASE, RDRND, >> FMA, BMI, BMI2 and F16C instruction set support. >> ?broadwell? >> Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, >> SSE, SSE2, >> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, >> FSGSBASE, >> RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW >> instruction set >> support. >> ?skylake? 
>> Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, >> SSE, SSE2, SSE3, >> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, >> FSGSBASE, RDRND, >> FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, >> XSAVEC and >> XSAVES instruction set support. >> ?bonnell? >> Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, >> SSE, SSE2, SSE3 >> and SSSE3 instruction set support. >> ?silvermont? >> Intel Silvermont CPU with 64-bit extensions, MOVBE, >> MMX, SSE, SSE2, >> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND >> instruction set >> support. >> ?knl? >> Intel Knight's Landing CPU with 64-bit extensions, >> MOVBE, MMX, SSE, >> SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, >> PCLMUL, >> FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, >> PREFETCHW, AVX512F, >> AVX512PF, AVX512ER and AVX512CD instruction set support. >> ?skylake-avx512? >> Intel Skylake Server CPU with 64-bit extensions, MOVBE, >> MMX, SSE, SSE2, >> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, >> PCLMUL, FSGSBASE, >> RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, >> CLFLUSHOPT, XSAVEC, >> XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ and AVX512CD >> instruction set >> support. >> >> " >> >> Comments are welcome, >> Laurent >> >> >> From peter.lawrey at gmail.com Fri Oct 20 08:31:34 2017 From: peter.lawrey at gmail.com (Peter Lawrey) Date: Fri, 20 Oct 2017 09:31:34 +0100 Subject: Upgrading gcc arch ? In-Reply-To: References: Message-ID: I know the drive is toward smaller builds, but it would be good to auto select the CPU level at run time. I suspect however, this is something the OpenJDK (or a vendor supporting it) could do. Perhaps code which is CPU model sensitive could be placed in a small shared library with multiple versions and the appropriate build selected at runtime or on installation. Regards, Peter. ? On 20 October 2017 at 09:19, Laurent Bourg?s wrote: > Hi, > > I wonder if it is time to compile c/c++ code with a more recent cpu > architecture (x86-64 is quite old: only SSE ?) to take benefit of > performance optimizations offered by recent CPU and compilers (AVX...). > > Of course that means such builds would be specific to a CPU class and that > will require build changes to make multiple flavors depending on the CPU > classes ... > > See gcc -mtune argument: > https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html > > " > ?sandybridge? > Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set support. > ?ivybridge? > Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C > instruction set support. > ?haswell? > Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, > FMA, BMI, BMI2 and F16C instruction set support. > ?broadwell? > Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, > SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, > RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set > support. > ?skylake? > Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, > FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and > XSAVES instruction set support. > ?bonnell? 
> Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3 > and SSSE3 instruction set support. > ?silvermont? > Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, > SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction set > support. > ?knl? > Intel Knight's Landing CPU with 64-bit extensions, MOVBE, MMX, SSE, > SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, > FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F, > AVX512PF, AVX512ER and AVX512CD instruction set support. > ?skylake-avx512? > Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, > SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE, > RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, > XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction set > support. > > " > > Comments are welcome, > Laurent > From bob.vandette at oracle.com Fri Oct 20 18:44:31 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Fri, 20 Oct 2017 14:44:31 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <5d217c60-3049-30a6-c207-d6c9274a5ddf@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <39AD9F8D-7E2B-4C15-8525-36DBA7C74302@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> <799205ae-ba9f-ce3a-8dd6-1a55e32689df@oracle.com> <9956F9D0-B01B-44FE-AE56-527907816436@oracle.com> <20ef0bac-1942-b29f-a9e2-4ea4d4f81cd2@oracle.com> <5d217c60-3049-30a6-c207-d6c9274a5ddf@oracle.com> Message-ID: <1C03FCB5-969B-4C43-8BAD-EF939515FEC2@oracle.com> Here?s an updated webrev that hopefully takes care of all remaining comments. http://cr.openjdk.java.net/~bobv/8146115/webrev.02 I added the deprecation of the UseCGroupMemoryLimitForHeap option this round since this experimental option should no longer be necessary. Bob. > On Oct 13, 2017, at 9:34 AM, David Holmes wrote: > > Reading back through my suggestion for os.hpp initialize_container_support should just be init_container_support. > > Thanks, > David > > On 13/10/2017 11:14 PM, Bob Vandette wrote: >>> On Oct 12, 2017, at 11:08 PM, David Holmes wrote: >>> >>> Hi Bob, >>> >>> On 13/10/2017 1:43 AM, Bob Vandette wrote: >>>>> On Oct 11, 2017, at 9:04 PM, David Holmes wrote: >>>>> >>>>> Hi Bob, >>>>> >>>>> On 12/10/2017 5:11 AM, Bob Vandette wrote: >>>>>> Here?s an updated webrev for this RFE that contains changes and cleanups based on feedback I?ve received so far. >>>>>> I?m still investigating the best approach for reacting to cpu shares and quotas. I do not believe doing nothing is the answer. >>>>> >>>>> I do. :) Let me try this again. When you run outside of a container you don't get 100% of the CPUs - you have to share with whatever else is running on the system. You get a fraction of CPU time based on the load. We don't try to communicate load information to the VM/application so it can adapt. Within a container setting shares/quotas is just a way of setting an artificial load. So why should we be treating it any differently? >>>> Because today we optimize for a lightly loaded system and when running serverless applications in containers we should be >>>> optimizing for a fully loaded system. 
If developers don?t want this, then don?t use shares or quotas and you?ll have exactly >>>> the behavior you have today. I think we just have to document the new behavior (and how to turn it off) so people know what >>>> to expect. >>> >>> The person deploying the app may not have control over how the app is deployed in terms of shares/quotas. It all depends how (and who) manages the containers. This is a big part of my problem/concerns here that I don't know exactly how all this is organized and who knows what in advance and what they can control. >>> >>> But I'll let this drop, other than raising an additional concern. I don't think just allowing the user to hardwire the number of processors to use will necessarily solve the problem with what available_processors() returns. I'm concerned the execution of the VM may occur in a context where the number of processors is not known in advance, and the user can not disable shares/quotas. In that case we may need to have a flag that says to ignore shares/quotas in the processor count calculation. >> I?m not sure that?s a high probability issue. It?s my understanding that whoever is configuring the container >> management will be specifying the resources required to run these applications which comes along with a >> guarantee of these resources. If this issue does come up, I do have the -XX:-UseContainerSupport big >> switch that turns all of this off. It will however disable the memory support as well. >>> >>>> You seem to discount the added cost of 100s of VMs creating lots of un-necessaary threads. In the current JDK 10 code base, >>>> In a heavily loaded system with 88 processors, VmData grows from 60MBs (1 cpu) to 376MB (88 cpus). This is only mapped >>>> memory and it depends heavily on how deep in the stack these threads go before it impacts VmRSS but it shows the potential downside >>>> of having 100s of VMs thinking they each own the entire machine. >>> >>> I agree that the default ergonomics does not scale well. Anyone doing any serious Java deployment tunes the VM explicitly and does not rely on the defaults. How will they do that in a container environment? I don't know. >>> >>> I would love to see some actual deployment scenarios/experiences for this to understand things better. >> This is one of the reasons I want to get this support out in JDK 10, to get some feedback under real scenarios. >>> >>>> I haven?t even done any experiments to determine the added context switching cost if the VM decides to use excessive >>>> pthreads. >>>>> >>>>> That's not to say an API to provide load/shares/quota information may not be useful, but that is a separate issue to what the "active processor count" should report. >>>> I don?t have a problem with active processor count reporting the number of processors we have, but I do have a problem >>>> with our current usage of this information within the VM and Core libraries. >>> >>> That is a somewhat separate issue. One worth pursuing separately. >> We should look at this as part of the ?Container aware Java? JEP. >>> >>>>> >>>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.01 >>>>>> Updates: >>>>>> 1. I had to move the processing of AggressiveHeap since the container memory size needs to be known before this can be processed. >>>>> >>>>> I don't like the placement of this - we don't call os:: init functions from inside Arguments - we manage the initialization sequence from Threads::create_vm. 
Seems to me that container initialization can/should happen in os::init_before_ergo, and the AggressiveHeap processing can occur at the start of Arguments::apply_ergo(). >>>>> >>>>> That said we need to be sure nothing touched by set_aggressive_heap_flags will be used before we now reach that code - there are a lot of flags being set in there. >>>> This is exactly the reason why I put the call where it did. I put the call to set_aggressive_heap_flags in finalize_vm_init_args >>>> because that is exactly what this call is doing. It?s finalizing flags used after the parsing. The impacted flags are definitely being >>>> used shortly after and before init_before_ergo is called. >>> >>> I see that now and it is very unfortunate because I really do not like what you had to do here. As you can tell from the logic in create_vm we have always refactored to ensure we can progressively manage the interleaving of OS initialization with Arguments processing. So having a deep part of Argument processing go off and call some more OS initialization is not nice. That said I can't see a way around it without very unreasonable refactoring. >>> >>> But I do have a couple of changes I'd like to request please: >>> >>> 1. Move the call to os::initialize_container_support() up a level to before the call to finalize_vm_init_args(), with a more elaborate comment: >>> >>> // We need to ensure processor and memory resources have been properly >>> // configured - which may rely on arguments we just processed - before >>> // doing the final argument processing. Any argument processing that >>> // needs to know about processor and memory resources must occur after >>> // this point. >>> >>> os::initialize_container_support(); >>> >>> // Do final processing now that all arguments have been parsed >>> result = finalize_vm_init_args(patch_mod_javabase); >>> >>> 2. Simplify and modify os.hpp as follows: >>> >>> + LINUX_ONLY(static void pd_initialize_container_support();) >>> >>> public: >>> static void init(void); // Called before command line parsing >>> >>> + static void initialize_container_support() { // Called during command line parsing >>> + LINUX_ONLY(pd_initialize_container_support();) >>> + } >>> >>> static void init_before_ergo(void); // Called after command line parsing >>> // before VM ergonomics >>> >>> 3. In thread.cpp add a comment here: >>> >>> // Parse arguments >>> + // Note: this internally calls os::initialize_container_support() >>> jint parse_result = Arguments::parse(args); >> All very reasonable changes. >> Thanks, >> Bob. >>> >>> Thanks. >>> >>>>> >>>>>> 2. I no longer use the cpuset.cpus contents since sched_getaffinity reports the correct results >>>>>> even if someone manually updates the cgroup data. I originally didn?t think this was the case since >>>>>> sched_setaffinity didn?t automatically update the cpuset file contents but the inverse is true. >>>>> >>>>> Ok. >>>>> >>>>>> 3. I ifdef?d the container function support in src/hotspot/share/runtime/os.hpp to avoid putting stubs in all other os >>>>>> platform directories. I can do this if it?s absolutely necessary. >>>>> >>>>> You should not need to do this if initialization moves as I suggested above. os::init_before_ergo() in os_linux.cpp can call OSContainer::init(). >>>>> No need for os::initialize_container_support() or os::pd_initialize_container_support. >>>> But os::init_before_ergo is in shared code. >>> >>> Yep my bad - point is moot now anyway. 
>>> >>> >>> >>>>> src/hotspot/os/linux/os_linux.cpp/.hpp >>>>> >>>>> 187 log_trace(os)("available container memory: " JULONG_FORMAT, avail_mem); >>>>> 188 return avail_mem; >>>>> 189 } else { >>>>> 190 log_debug(os,container)("container memory usage call failed: " JLONG_FORMAT, mem_usage); >>>>> >>>>> Why "trace" (the third logging level) to show the information, but "debug" (the second level) to show failed calls? You use debug in other files for basic info. Overall I'm unclear on your use of debug versus trace for the logging. >>>> I use trace for noisy information that is not reporting errors and debug for failures that are informational and not fatal. >>>> In this case, the call could return -1 or -2. -1 is unlimited and -2 is an error. In either case we fallback to the >>>> standard system call to get available memory. I would have used warning but since these messages were occurring >>>> during a test run causing test failures. >>> >>> Okay. Thanks for clarifying. >>> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/os/linux/osContainer_linux.cpp >>>>> >>>>> Dead code: >>>>> >>>>> 376 #if 0 >>>>> 377 os::Linux::print_container_info(tty); >>>>> ... >>>>> 390 #endif >>>> I left it in for standalone testing. Should I use some other #if? >>> >>> We don't generally leave in dead code in the runtime code. Do you see this as useful after you've finalized the changes? >>> >>> Is this testing just for showing the logging? Is it worth making this a logging controlled call? Is it suitable for a Gtest test? >>> >>> Thanks, >>> David >>> ----- >>> >>>> Bob. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Bob. From kim.barrett at oracle.com Sat Oct 21 05:23:47 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Sat, 21 Oct 2017 01:23:47 -0400 Subject: RFR: 8189088: Add intrusive doubly-linked list utility In-Reply-To: References: Message-ID: > On Oct 10, 2017, at 4:29 AM, Kim Barrett wrote: > > RFR: 8189088: Add intrusive doubly-linked list utility Based on some offline feedback, I?m withdrawing this change to do some rework. From david.holmes at oracle.com Sun Oct 22 21:52:12 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 23 Oct 2017 07:52:12 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <1C03FCB5-969B-4C43-8BAD-EF939515FEC2@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> <799205ae-ba9f-ce3a-8dd6-1a55e32689df@oracle.com> <9956F9D0-B01B-44FE-AE56-527907816436@oracle.com> <20ef0bac-1942-b29f-a9e2-4ea4d4f81cd2@oracle.com> <5d217c60-3049-30a6-c207-d6c9274a5ddf@oracle.com> <1C03FCB5-969B-4C43-8BAD-EF939515FEC2@oracle.com> Message-ID: <51f57623-8ce5-0883-69cc-9ba6b39b5a65@oracle.com> Hi Bob, Changes seem fine. I'll take up the issue of whether this should be enabled by default in the CSR. Thanks, David On 21/10/2017 4:44 AM, Bob Vandette wrote: > Here?s an updated webrev that hopefully takes care of all remaining comments. > > http://cr.openjdk.java.net/~bobv/8146115/webrev.02 > > I added the deprecation of the UseCGroupMemoryLimitForHeap option this round since > this experimental option should no longer be necessary. > > > Bob. > > >> On Oct 13, 2017, at 9:34 AM, David Holmes wrote: >> >> Reading back through my suggestion for os.hpp initialize_container_support should just be init_container_support. 
>> >> Thanks, >> David >> >> On 13/10/2017 11:14 PM, Bob Vandette wrote: >>>> On Oct 12, 2017, at 11:08 PM, David Holmes wrote: >>>> >>>> Hi Bob, >>>> >>>> On 13/10/2017 1:43 AM, Bob Vandette wrote: >>>>>> On Oct 11, 2017, at 9:04 PM, David Holmes wrote: >>>>>> >>>>>> Hi Bob, >>>>>> >>>>>> On 12/10/2017 5:11 AM, Bob Vandette wrote: >>>>>>> Here?s an updated webrev for this RFE that contains changes and cleanups based on feedback I?ve received so far. >>>>>>> I?m still investigating the best approach for reacting to cpu shares and quotas. I do not believe doing nothing is the answer. >>>>>> >>>>>> I do. :) Let me try this again. When you run outside of a container you don't get 100% of the CPUs - you have to share with whatever else is running on the system. You get a fraction of CPU time based on the load. We don't try to communicate load information to the VM/application so it can adapt. Within a container setting shares/quotas is just a way of setting an artificial load. So why should we be treating it any differently? >>>>> Because today we optimize for a lightly loaded system and when running serverless applications in containers we should be >>>>> optimizing for a fully loaded system. If developers don?t want this, then don?t use shares or quotas and you?ll have exactly >>>>> the behavior you have today. I think we just have to document the new behavior (and how to turn it off) so people know what >>>>> to expect. >>>> >>>> The person deploying the app may not have control over how the app is deployed in terms of shares/quotas. It all depends how (and who) manages the containers. This is a big part of my problem/concerns here that I don't know exactly how all this is organized and who knows what in advance and what they can control. >>>> >>>> But I'll let this drop, other than raising an additional concern. I don't think just allowing the user to hardwire the number of processors to use will necessarily solve the problem with what available_processors() returns. I'm concerned the execution of the VM may occur in a context where the number of processors is not known in advance, and the user can not disable shares/quotas. In that case we may need to have a flag that says to ignore shares/quotas in the processor count calculation. >>> I?m not sure that?s a high probability issue. It?s my understanding that whoever is configuring the container >>> management will be specifying the resources required to run these applications which comes along with a >>> guarantee of these resources. If this issue does come up, I do have the -XX:-UseContainerSupport big >>> switch that turns all of this off. It will however disable the memory support as well. >>>> >>>>> You seem to discount the added cost of 100s of VMs creating lots of un-necessaary threads. In the current JDK 10 code base, >>>>> In a heavily loaded system with 88 processors, VmData grows from 60MBs (1 cpu) to 376MB (88 cpus). This is only mapped >>>>> memory and it depends heavily on how deep in the stack these threads go before it impacts VmRSS but it shows the potential downside >>>>> of having 100s of VMs thinking they each own the entire machine. >>>> >>>> I agree that the default ergonomics does not scale well. Anyone doing any serious Java deployment tunes the VM explicitly and does not rely on the defaults. How will they do that in a container environment? I don't know. >>>> >>>> I would love to see some actual deployment scenarios/experiences for this to understand things better. 
>>> This is one of the reasons I want to get this support out in JDK 10, to get some feedback under real scenarios. >>>> >>>>> I haven?t even done any experiments to determine the added context switching cost if the VM decides to use excessive >>>>> pthreads. >>>>>> >>>>>> That's not to say an API to provide load/shares/quota information may not be useful, but that is a separate issue to what the "active processor count" should report. >>>>> I don?t have a problem with active processor count reporting the number of processors we have, but I do have a problem >>>>> with our current usage of this information within the VM and Core libraries. >>>> >>>> That is a somewhat separate issue. One worth pursuing separately. >>> We should look at this as part of the ?Container aware Java? JEP. >>>> >>>>>> >>>>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.01 >>>>>>> Updates: >>>>>>> 1. I had to move the processing of AggressiveHeap since the container memory size needs to be known before this can be processed. >>>>>> >>>>>> I don't like the placement of this - we don't call os:: init functions from inside Arguments - we manage the initialization sequence from Threads::create_vm. Seems to me that container initialization can/should happen in os::init_before_ergo, and the AggressiveHeap processing can occur at the start of Arguments::apply_ergo(). >>>>>> >>>>>> That said we need to be sure nothing touched by set_aggressive_heap_flags will be used before we now reach that code - there are a lot of flags being set in there. >>>>> This is exactly the reason why I put the call where it did. I put the call to set_aggressive_heap_flags in finalize_vm_init_args >>>>> because that is exactly what this call is doing. It?s finalizing flags used after the parsing. The impacted flags are definitely being >>>>> used shortly after and before init_before_ergo is called. >>>> >>>> I see that now and it is very unfortunate because I really do not like what you had to do here. As you can tell from the logic in create_vm we have always refactored to ensure we can progressively manage the interleaving of OS initialization with Arguments processing. So having a deep part of Argument processing go off and call some more OS initialization is not nice. That said I can't see a way around it without very unreasonable refactoring. >>>> >>>> But I do have a couple of changes I'd like to request please: >>>> >>>> 1. Move the call to os::initialize_container_support() up a level to before the call to finalize_vm_init_args(), with a more elaborate comment: >>>> >>>> // We need to ensure processor and memory resources have been properly >>>> // configured - which may rely on arguments we just processed - before >>>> // doing the final argument processing. Any argument processing that >>>> // needs to know about processor and memory resources must occur after >>>> // this point. >>>> >>>> os::initialize_container_support(); >>>> >>>> // Do final processing now that all arguments have been parsed >>>> result = finalize_vm_init_args(patch_mod_javabase); >>>> >>>> 2. 
Simplify and modify os.hpp as follows: >>>> >>>> + LINUX_ONLY(static void pd_initialize_container_support();) >>>> >>>> public: >>>> static void init(void); // Called before command line parsing >>>> >>>> + static void initialize_container_support() { // Called during command line parsing >>>> + LINUX_ONLY(pd_initialize_container_support();) >>>> + } >>>> >>>> static void init_before_ergo(void); // Called after command line parsing >>>> // before VM ergonomics >>>> >>>> 3. In thread.cpp add a comment here: >>>> >>>> // Parse arguments >>>> + // Note: this internally calls os::initialize_container_support() >>>> jint parse_result = Arguments::parse(args); >>> All very reasonable changes. >>> Thanks, >>> Bob. >>>> >>>> Thanks. >>>> >>>>>> >>>>>>> 2. I no longer use the cpuset.cpus contents since sched_getaffinity reports the correct results >>>>>>> even if someone manually updates the cgroup data. I originally didn?t think this was the case since >>>>>>> sched_setaffinity didn?t automatically update the cpuset file contents but the inverse is true. >>>>>> >>>>>> Ok. >>>>>> >>>>>>> 3. I ifdef?d the container function support in src/hotspot/share/runtime/os.hpp to avoid putting stubs in all other os >>>>>>> platform directories. I can do this if it?s absolutely necessary. >>>>>> >>>>>> You should not need to do this if initialization moves as I suggested above. os::init_before_ergo() in os_linux.cpp can call OSContainer::init(). >>>>>> No need for os::initialize_container_support() or os::pd_initialize_container_support. >>>>> But os::init_before_ergo is in shared code. >>>> >>>> Yep my bad - point is moot now anyway. >>>> >>>> >>>> >>>>>> src/hotspot/os/linux/os_linux.cpp/.hpp >>>>>> >>>>>> 187 log_trace(os)("available container memory: " JULONG_FORMAT, avail_mem); >>>>>> 188 return avail_mem; >>>>>> 189 } else { >>>>>> 190 log_debug(os,container)("container memory usage call failed: " JLONG_FORMAT, mem_usage); >>>>>> >>>>>> Why "trace" (the third logging level) to show the information, but "debug" (the second level) to show failed calls? You use debug in other files for basic info. Overall I'm unclear on your use of debug versus trace for the logging. >>>>> I use trace for noisy information that is not reporting errors and debug for failures that are informational and not fatal. >>>>> In this case, the call could return -1 or -2. -1 is unlimited and -2 is an error. In either case we fallback to the >>>>> standard system call to get available memory. I would have used warning but since these messages were occurring >>>>> during a test run causing test failures. >>>> >>>> Okay. Thanks for clarifying. >>>> >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/os/linux/osContainer_linux.cpp >>>>>> >>>>>> Dead code: >>>>>> >>>>>> 376 #if 0 >>>>>> 377 os::Linux::print_container_info(tty); >>>>>> ... >>>>>> 390 #endif >>>>> I left it in for standalone testing. Should I use some other #if? >>>> >>>> We don't generally leave in dead code in the runtime code. Do you see this as useful after you've finalized the changes? >>>> >>>> Is this testing just for showing the logging? Is it worth making this a logging controlled call? Is it suitable for a Gtest test? >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Bob. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Bob. 
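As a concrete illustration of the cgroup v1 reads being reviewed in this thread: each accessor boils down to opening a per-subsystem file and parsing a single value. The following is a minimal standalone sketch only, not the code from webrev.02; the function name and the mount point shown in the usage comment are assumptions for a typical cgroup v1 layout.

    #include <cstdio>

    // Read one integral value, e.g. memory.limit_in_bytes, from a cgroup v1 file.
    // Returns true on success. A missing file typically means "not containerized"
    // (or a different cgroup layout), and a huge value effectively means "unlimited".
    static bool read_cgroup_value(const char* path, long long* out) {
      FILE* f = fopen(path, "r");
      if (f == NULL) {
        return false;
      }
      long long v = 0;
      // Literal format string, so -Wformat-nonliteral (enabled by -Wformat=2)
      // is not triggered here.
      int matched = fscanf(f, "%lld", &v);
      fclose(f);
      if (matched != 1) {
        return false;
      }
      *out = v;
      return true;
    }

    // Example use (the path is an assumption for a cgroup v1 memory controller):
    //   long long limit;
    //   if (read_cgroup_value("/sys/fs/cgroup/memory/memory.limit_in_bytes", &limit)) { ... }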
> From david.holmes at oracle.com Mon Oct 23 01:59:55 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 23 Oct 2017 11:59:55 +1000 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <51f57623-8ce5-0883-69cc-9ba6b39b5a65@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> <799205ae-ba9f-ce3a-8dd6-1a55e32689df@oracle.com> <9956F9D0-B01B-44FE-AE56-527907816436@oracle.com> <20ef0bac-1942-b29f-a9e2-4ea4d4f81cd2@oracle.com> <5d217c60-3049-30a6-c207-d6c9274a5ddf@oracle.com> <1C03FCB5-969B-4C43-8BAD-EF939515FEC2@oracle.com> <51f57623-8ce5-0883-69cc-9ba6b39b5a65@oracle.com> Message-ID: <46ef96d6-f10a-7da4-8101-08bfc281705d@oracle.com> Sorry just spotted a minor issue when actually running the code. Many of you log statements include \n in the format string. This is unnecessary and results in lots of blank lines in the logging output eg: [0.002s][trace][os,container] OSContainer::init: Initializing Container Support [0.003s][trace][os,container] Path to /memory.limit_in_bytes is /cgroup/memory//memory.limit_in_bytes [0.003s][trace][os,container] Memory Limit is: 9223372036854775807 [0.004s][trace][os,container] Memory Limit is: Unlimited [0.004s][trace][os ] active_processor_count: using static path - configured processors: 4 [0.004s][trace][os ] active_processor_count: sched_getaffinity processor count: 4 [0.004s][trace][os,container] Path to /cpu.shares is /cgroup/cpu//cpu.shares [0.005s][trace][os,container] CPU Shares is: 1024 [0.005s][trace][os,container] Path to /cpu.cfs_quota_us is /cgroup/cpu//cpu.cfs_quota_us [0.005s][debug][os,container] file not found /cgroup/cpu//cpu.cfs_quota_us [0.005s][debug][os,container] Error reading /cpu.cfs_quota_us [0.005s][trace][os,container] Path to /cpu.cfs_period_us is /cgroup/cpu//cpu.cfs_period_us [0.006s][debug][os,container] file not found /cgroup/cpu//cpu.cfs_period_us [0.006s][debug][os,container] Error reading /cpu.cfs_period_us Thanks, David On 23/10/2017 7:52 AM, David Holmes wrote: > Hi Bob, > > Changes seem fine. > > I'll take up the issue of whether this should be enabled by default in > the CSR. > > Thanks, > David > > On 21/10/2017 4:44 AM, Bob Vandette wrote: >> Here?s an updated webrev that hopefully takes care of all remaining >> comments. >> >> http://cr.openjdk.java.net/~bobv/8146115/webrev.02 >> >> I added the deprecation of the UseCGroupMemoryLimitForHeap option this >> round since >> this experimental option should no longer be necessary. >> >> >> Bob. >> >> >>> On Oct 13, 2017, at 9:34 AM, David Holmes >>> wrote: >>> >>> Reading back through my suggestion for os.hpp >>> initialize_container_support should just be init_container_support. >>> >>> Thanks, >>> David >>> >>> On 13/10/2017 11:14 PM, Bob Vandette wrote: >>>>> On Oct 12, 2017, at 11:08 PM, David Holmes >>>>> wrote: >>>>> >>>>> Hi Bob, >>>>> >>>>> On 13/10/2017 1:43 AM, Bob Vandette wrote: >>>>>>> On Oct 11, 2017, at 9:04 PM, David Holmes >>>>>>> wrote: >>>>>>> >>>>>>> Hi Bob, >>>>>>> >>>>>>> On 12/10/2017 5:11 AM, Bob Vandette wrote: >>>>>>>> Here?s an updated webrev for this RFE that contains changes and >>>>>>>> cleanups based on feedback I?ve received so far. >>>>>>>> I?m still investigating the best approach for reacting to cpu >>>>>>>> shares and quotas.? 
I do not believe doing nothing is the answer. >>>>>>> >>>>>>> I do. :) Let me try this again. When you run outside of a >>>>>>> container you don't get 100% of the CPUs - you have to share with >>>>>>> whatever else is running on the system. You get a fraction of CPU >>>>>>> time based on the load. We don't try to communicate load >>>>>>> information to the VM/application so it can adapt. Within a >>>>>>> container setting shares/quotas is just a way of setting an >>>>>>> artificial load. So why should we be treating it any differently? >>>>>> Because today we optimize for a lightly loaded system and when >>>>>> running serverless applications in containers we should be >>>>>> optimizing for a fully loaded system.? If developers don?t want >>>>>> this, then don?t use shares or quotas and you?ll have exactly >>>>>> the behavior you have today.? I think we just have to document the >>>>>> new behavior (and how to turn it off) so people know what >>>>>> to expect. >>>>> >>>>> The person deploying the app may not have control over how the app >>>>> is deployed in terms of shares/quotas. It all depends how (and who) >>>>> manages the containers. This is a big part of my problem/concerns >>>>> here that I don't know exactly how all this is organized and who >>>>> knows what in advance and what they can control. >>>>> >>>>> But I'll let this drop, other than raising an additional concern. I >>>>> don't think just allowing the user to hardwire the number of >>>>> processors to use will necessarily solve the problem with what >>>>> available_processors() returns. I'm concerned the execution of the >>>>> VM may occur in a context where the number of processors is not >>>>> known in advance, and the user can not disable shares/quotas. In >>>>> that case we may need to have a flag that says to ignore >>>>> shares/quotas in the processor count calculation. >>>> I?m not sure that?s a high probability issue.? It?s my understanding >>>> that whoever is configuring the container >>>> management will be specifying the resources required to run these >>>> applications which comes along with a >>>> guarantee of these resources.? If this issue does come up, I do have >>>> the -XX:-UseContainerSupport big >>>> switch that turns all of this off.? It will however disable the >>>> memory support as well. >>>>> >>>>>> You seem to discount the added cost of 100s of VMs creating lots >>>>>> of un-necessaary threads.? In the current JDK 10 code base, >>>>>> In a heavily loaded system with 88 processors, VmData grows from >>>>>> 60MBs (1 cpu) to 376MB (88 cpus).? This is only mapped >>>>>> memory and it depends heavily on how deep in the stack these >>>>>> threads go before it impacts VmRSS but it shows the potential >>>>>> downside >>>>>> of having 100s of VMs thinking they each own the entire machine. >>>>> >>>>> I agree that the default ergonomics does not scale well. Anyone >>>>> doing any serious Java deployment tunes the VM explicitly and does >>>>> not rely on the defaults. How will they do that in a container >>>>> environment? I don't know. >>>>> >>>>> I would love to see some actual deployment scenarios/experiences >>>>> for this to understand things better. >>>> This is one of the reasons I want to get this support out in JDK 10, >>>> to get some feedback under real scenarios. >>>>> >>>>>> I haven?t even done any experiments to determine the added context >>>>>> switching cost if the VM decides to use excessive >>>>>> pthreads. 
>>>>>>> >>>>>>> That's not to say an API to provide load/shares/quota information >>>>>>> may not be useful, but that is a separate issue to what the >>>>>>> "active processor count" should report. >>>>>> I don?t have a problem with active processor count reporting the >>>>>> number of processors we have, but I do have a problem >>>>>> with our current usage of this information within the VM and Core >>>>>> libraries. >>>>> >>>>> That is a somewhat separate issue. One worth pursuing separately. >>>> We should look at this as part of the ?Container aware Java? JEP. >>>>> >>>>>>> >>>>>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.01 >>>>>>>> Updates: >>>>>>>> 1. I had to move the processing of AggressiveHeap since the >>>>>>>> container memory size needs to be known before this can be >>>>>>>> processed. >>>>>>> >>>>>>> I don't like the placement of this - we don't call os:: init >>>>>>> functions from inside Arguments - we manage the initialization >>>>>>> sequence from Threads::create_vm. Seems to me that container >>>>>>> initialization can/should happen in os::init_before_ergo, and the >>>>>>> AggressiveHeap processing can occur at the start of >>>>>>> Arguments::apply_ergo(). >>>>>>> >>>>>>> That said we need to be sure nothing touched by >>>>>>> set_aggressive_heap_flags will be used before we now reach that >>>>>>> code - there are a lot of flags being set in there. >>>>>> This is exactly the reason why I put the call where it did.? I put >>>>>> the call to set_aggressive_heap_flags in finalize_vm_init_args >>>>>> because that is exactly what this call is doing.? It?s finalizing >>>>>> flags used after the parsing.? The impacted flags are definitely >>>>>> being >>>>>> used shortly after and before init_before_ergo is called. >>>>> >>>>> I see that now and it is very unfortunate because I really do not >>>>> like what you had to do here. As you can tell from the logic in >>>>> create_vm we have always refactored to ensure we can progressively >>>>> manage the interleaving of OS initialization with Arguments >>>>> processing. So having a deep part of Argument processing go off and >>>>> call some more OS initialization is not nice. That said I can't see >>>>> a way around it without very unreasonable refactoring. >>>>> >>>>> But I do have a couple of changes I'd like to request please: >>>>> >>>>> 1. Move the call to os::initialize_container_support() up a level >>>>> to before the call to finalize_vm_init_args(), with a more >>>>> elaborate comment: >>>>> >>>>> // We need to ensure processor and memory resources have been properly >>>>> // configured - which may rely on arguments we just processed - before >>>>> // doing the final argument processing. Any argument processing that >>>>> // needs to know about processor and memory resources must occur after >>>>> // this point. >>>>> >>>>> os::initialize_container_support(); >>>>> >>>>> // Do final processing now that all arguments have been parsed >>>>> result = finalize_vm_init_args(patch_mod_javabase); >>>>> >>>>> 2. Simplify and modify os.hpp as follows: >>>>> >>>>> +? LINUX_ONLY(static void pd_initialize_container_support();) >>>>> >>>>> ?? public: >>>>> ??? static void init(void);????????????????????? // Called before >>>>> command line parsing >>>>> >>>>> +?? static void initialize_container_support() { // Called during >>>>> command line parsing >>>>> +???? LINUX_ONLY(pd_initialize_container_support();) >>>>> +?? } >>>>> >>>>> ??? static void init_before_ergo(void);????????? 
// Called after >>>>> command line parsing >>>>> ???????????????????????????????????????????????? // before VM >>>>> ergonomics >>>>> >>>>> 3. In thread.cpp add a comment here: >>>>> >>>>> ?? // Parse arguments >>>>> +? // Note: this internally calls os::initialize_container_support() >>>>> ?? jint parse_result = Arguments::parse(args); >>>> All very reasonable changes. >>>> Thanks, >>>> Bob. >>>>> >>>>> Thanks. >>>>> >>>>>>> >>>>>>>> 2. I no longer use the cpuset.cpus contents since >>>>>>>> sched_getaffinity reports the correct results >>>>>>>> even if someone manually updates the cgroup data.? I originally >>>>>>>> didn?t think this was the case since >>>>>>>> sched_setaffinity didn?t automatically update the cpuset file >>>>>>>> contents but the inverse is true. >>>>>>> >>>>>>> Ok. >>>>>>> >>>>>>>> 3. I ifdef?d the container function support in >>>>>>>> src/hotspot/share/runtime/os.hpp to avoid putting stubs in all >>>>>>>> other os >>>>>>>> platform directories.? I can do this if it?s absolutely necessary. >>>>>>> >>>>>>> You should not need to do this if initialization moves as I >>>>>>> suggested above. os::init_before_ergo() in os_linux.cpp can call >>>>>>> OSContainer::init(). >>>>>>> No need for os::initialize_container_support() or >>>>>>> os::pd_initialize_container_support. >>>>>> But os::init_before_ergo is in shared code. >>>>> >>>>> Yep my bad - point is moot now anyway. >>>>> >>>>> >>>>> >>>>>>> src/hotspot/os/linux/os_linux.cpp/.hpp >>>>>>> >>>>>>> 187???????? log_trace(os)("available container memory: " >>>>>>> JULONG_FORMAT, avail_mem); >>>>>>> 188???????? return avail_mem; >>>>>>> 189?????? } else { >>>>>>> 190???????? log_debug(os,container)("container memory usage call >>>>>>> failed: " JLONG_FORMAT, mem_usage); >>>>>>> >>>>>>> Why "trace" (the third logging level) to show the information, >>>>>>> but "debug" (the second level) to show failed calls? You use >>>>>>> debug in other files for basic info. Overall I'm unclear on your >>>>>>> use of debug versus trace for the logging. >>>>>> I use trace for noisy information that is not reporting errors and >>>>>> debug for failures that are informational and not fatal. >>>>>> In this case, the call could return -1 or -2.? -1 is unlimited and >>>>>> -2 is an error.? In either case we fallback to the >>>>>> standard system call to get available memory.? I would have used >>>>>> warning but since these messages were occurring >>>>>> during a test run causing test failures. >>>>> >>>>> Okay. Thanks for clarifying. >>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/os/linux/osContainer_linux.cpp >>>>>>> >>>>>>> Dead code: >>>>>>> >>>>>>> 376 #if 0 >>>>>>> 377?? os::Linux::print_container_info(tty); >>>>>>> ... >>>>>>> 390 #endif >>>>>> I left it in for standalone testing.? Should I use some other #if? >>>>> >>>>> We don't generally leave in dead code in the runtime code. Do you >>>>> see this as useful after you've finalized the changes? >>>>> >>>>> Is this testing just for showing the logging? Is it worth making >>>>> this a logging controlled call? Is it suitable for a Gtest test? >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> Bob. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> Bob. 
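Illustrating David's note at the top of this message about "\n" in the log statements: Unified Logging appends its own line terminator, so a trailing "\n" in the format string is exactly what produces the blank lines in the output shown above. The lines below are hedged examples only (limit is an illustrative jlong variable, not a line taken from the webrev):

    log_trace(os, container)("Memory Limit is: " JLONG_FORMAT, limit);        // one log line
    log_trace(os, container)("Memory Limit is: " JLONG_FORMAT "\n", limit);   // log line followed by a blank line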
From kim.barrett at oracle.com  Mon Oct 23 04:52:18 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 23 Oct 2017 00:52:18 -0400
Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage
In-Reply-To: <833ba1a5-49fc-bb24-ff99-994011af52aa@oracle.com>
References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <2d9dd746-63e1-cade-28f9-5ca1ae1c253e@oracle.com> <200F07CB-35DA-492B-B78D-9EC033EE0431@oracle.com> <833ba1a5-49fc-bb24-ff99-994011af52aa@oracle.com>
Message-ID: 

> On Sep 27, 2017, at 9:20 PM, David Holmes wrote:
>>> 62 void set_subsystem_path(char *cgroup_path) {
>>>
>>> If this takes a "const char*" will it save you from casting string literals to "char*" elsewhere?
>> I tried several different ways of declaring the container accessor functions and
>> always ended up with warnings due to scanf not being able to validate arguments
>> since the format string didn't end up being a string literal. I originally was using templates
>> and then ended up with the macros. I tried several different casts but could not resolve the problem.
>
> Sounds like something Kim Barrett should take a look at :)

Fortunately, I just happened by.

The warnings are because we compile with -Wformat=2, which enables
-Wformat-nonliteral (among other things).

Use PRAGMA_FORMAT_NONLITERAL_IGNORED, e.g.

PRAGMA_DIAG_PUSH
PRAGMA_FORMAT_NONLITERAL_IGNORED
PRAGMA_DIAG_POP

That will silence warnings about sscanf (or anything else!) with a
non-literal format string within that scope.

Also, while I was looking at this, I noticed that in
get_subsytem_file_contents_##return_name, if the sum of the lengths of
get_subsystem_path() and filename is >= MAXBUF, then we can end up
reading from a file other than the one intended, if such a file exists.
That seems like it might be bad.

Also, the filename argument should be const char*.

From kim.barrett at oracle.com  Mon Oct 23 05:44:31 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 23 Oct 2017 01:44:31 -0400
Subject: RFR: 8163897: oop_store has unnecessary memory barriers
Message-ID: 

Please review this change to the oop_store function template, which
removes some unnecessary memory barriers, moves CMS-specific code into
GC-specific (though not completely CMS-specific) areas, and cleans up
the API a bit. See the CR for more details about the problems.

[Note: CTMRBS expands to CardTableModRefBS below.]

As a preliminary cleanup, CTMRBS::inline_write_ref_field has been
merged into its only caller, CTMRBS::write_ref_field_work. This left
the file gc/shared/cardTableModRefBS.inline.hpp effectively empty, so
it has been removed. As a related cleanup,
CTMRBS::inline_write_ref_field_pre was found to be unused and has been
removed.

The volatile overload for oop_store has been renamed to
release_oop_store, to correspond to its purpose. oop_store no longer
examines always_do_update_barrier to conditionally call the (now
renamed) volatile overload. The only other caller of the volatile
overload was release_obj_field_put, which has been updated for the new
name.

The release argument for BarrierSet::write_ref_field and all the
related implementation has been removed. Instead,
CTMRBS::write_ref_field_work now uses a release_store to mark the card
if card marking is required to be ordered after the value store, e.g.
for CMS, per the value of always_do_update_barrier.
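Schematically, the card-marking path described above takes roughly the following shape. This is a simplified sketch for illustration only; requires_ordered_marking(), byte_for() and dirty_card are paraphrased from the description, and the actual names and signatures in the webrev may differ.

    inline void CardTableModRefBS::write_ref_field_work(void* field, oop new_val) {
      volatile jbyte* card = byte_for(field);                // card covering the updated field
      if (requires_ordered_marking()) {                      // e.g. CMS with precleaning enabled
        OrderAccess::release_store(card, jbyte(dirty_card)); // card mark ordered after the oop store
      } else {
        *card = dirty_card;                                  // plain dirty-card store otherwise
      }
    }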
Finally, the global variable always_do_update_barrier, which was only needed for CMS, has been replaced with member variable CTMRBS::_requires_ordered_marking (with accessor functions). (G1 had commented out manipulation of this variable, added in commented out state as part of fix for 6904516; looks like debugging leftovers. Those have been removed.) So we now have [release_]oop_store, which (1) calls the barrier set's pre-barrier handler (which is a nop except for G1), (2) then performs a [release_]store of the new value, (3) and finally calls the barrier set's post-barrier handler. The post-barrier handler shared by Serial, Parallel, and CMS performs the card marking with a release barrier when requested (only for CMS). With these changes, a release store of the new value is only done when that's what is actually required by the caller, without regard to some hidden global variable. Also with these changes, only CMS (not Serial or Parallel) uses a release store for the card marking, and then only when actually needed, irrespective of whether the value store needed to be a release store. Finally, _requires_ordered_marking is now only set true when both UseConcMarkSweepGC and CMSPrecleaningEnabled are true, which matches the behavior of JITed code. Precleaning is what requires the ordering, so there's no point if it's disabled. CR: https://bugs.openjdk.java.net/browse/JDK-8163897 Webrev: http://cr.openjdk.java.net/~kbarrett/8163897/open.00/ Testing: hs-tier1 through hs-tier5. From dmitry.samersoff at bell-sw.com Mon Oct 23 07:37:14 2017 From: dmitry.samersoff at bell-sw.com (Dmitry Samersoff) Date: Mon, 23 Oct 2017 10:37:14 +0300 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: <51f57623-8ce5-0883-69cc-9ba6b39b5a65@oracle.com> References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <1d05a95f-75db-e0ca-e069-12fe41502e4f@oracle.com> <615e504d-94af-bae3-b721-6ca1dac6a567@oracle.com> <1BD883DB-8C8B-405D-8F85-3A026B19286F@oracle.com> <5f7f3d85-db48-fe6e-28f5-e3f4858f33e8@oracle.com> <799205ae-ba9f-ce3a-8dd6-1a55e32689df@oracle.com> <9956F9D0-B01B-44FE-AE56-527907816436@oracle.com> <20ef0bac-1942-b29f-a9e2-4ea4d4f81cd2@oracle.com> <5d217c60-3049-30a6-c207-d6c9274a5ddf@oracle.com> <1C03FCB5-969B-4C43-8BAD-EF939515FEC2@oracle.com> <51f57623-8ce5-0883-69cc-9ba6b39b5a65@oracle.com> Message-ID: Bob, I compiled and run .02 on aarch64 linux and it works as expected. -Dmitry On 23.10.2017 00:52, David Holmes wrote: > Hi Bob, > > Changes seem fine. > > I'll take up the issue of whether this should be enabled by default in > the CSR. > > Thanks, > David > > On 21/10/2017 4:44 AM, Bob Vandette wrote: >> Here?s an updated webrev that hopefully takes care of all remaining >> comments. >> >> http://cr.openjdk.java.net/~bobv/8146115/webrev.02 >> >> I added the deprecation of the UseCGroupMemoryLimitForHeap option this >> round since >> this experimental option should no longer be necessary. >> >> >> Bob. >> >> >>> On Oct 13, 2017, at 9:34 AM, David Holmes >>> wrote: >>> >>> Reading back through my suggestion for os.hpp >>> initialize_container_support should just be init_container_support. 
>>> >>> Thanks, >>> David >>> >>> On 13/10/2017 11:14 PM, Bob Vandette wrote: >>>>> On Oct 12, 2017, at 11:08 PM, David Holmes >>>>> wrote: >>>>> >>>>> Hi Bob, >>>>> >>>>> On 13/10/2017 1:43 AM, Bob Vandette wrote: >>>>>>> On Oct 11, 2017, at 9:04 PM, David Holmes >>>>>>> wrote: >>>>>>> >>>>>>> Hi Bob, >>>>>>> >>>>>>> On 12/10/2017 5:11 AM, Bob Vandette wrote: >>>>>>>> Here?s an updated webrev for this RFE that contains changes and >>>>>>>> cleanups based on feedback I?ve received so far. >>>>>>>> I?m still investigating the best approach for reacting to cpu >>>>>>>> shares and quotas.? I do not believe doing nothing is the answer. >>>>>>> >>>>>>> I do. :) Let me try this again. When you run outside of a >>>>>>> container you don't get 100% of the CPUs - you have to share with >>>>>>> whatever else is running on the system. You get a fraction of CPU >>>>>>> time based on the load. We don't try to communicate load >>>>>>> information to the VM/application so it can adapt. Within a >>>>>>> container setting shares/quotas is just a way of setting an >>>>>>> artificial load. So why should we be treating it any differently? >>>>>> Because today we optimize for a lightly loaded system and when >>>>>> running serverless applications in containers we should be >>>>>> optimizing for a fully loaded system.? If developers don?t want >>>>>> this, then don?t use shares or quotas and you?ll have exactly >>>>>> the behavior you have today.? I think we just have to document the >>>>>> new behavior (and how to turn it off) so people know what >>>>>> to expect. >>>>> >>>>> The person deploying the app may not have control over how the app >>>>> is deployed in terms of shares/quotas. It all depends how (and who) >>>>> manages the containers. This is a big part of my problem/concerns >>>>> here that I don't know exactly how all this is organized and who >>>>> knows what in advance and what they can control. >>>>> >>>>> But I'll let this drop, other than raising an additional concern. I >>>>> don't think just allowing the user to hardwire the number of >>>>> processors to use will necessarily solve the problem with what >>>>> available_processors() returns. I'm concerned the execution of the >>>>> VM may occur in a context where the number of processors is not >>>>> known in advance, and the user can not disable shares/quotas. In >>>>> that case we may need to have a flag that says to ignore >>>>> shares/quotas in the processor count calculation. >>>> I?m not sure that?s a high probability issue.? It?s my understanding >>>> that whoever is configuring the container >>>> management will be specifying the resources required to run these >>>> applications which comes along with a >>>> guarantee of these resources.? If this issue does come up, I do have >>>> the -XX:-UseContainerSupport big >>>> switch that turns all of this off.? It will however disable the >>>> memory support as well. >>>>> >>>>>> You seem to discount the added cost of 100s of VMs creating lots >>>>>> of un-necessaary threads.? In the current JDK 10 code base, >>>>>> In a heavily loaded system with 88 processors, VmData grows from >>>>>> 60MBs (1 cpu) to 376MB (88 cpus).? This is only mapped >>>>>> memory and it depends heavily on how deep in the stack these >>>>>> threads go before it impacts VmRSS but it shows the potential >>>>>> downside >>>>>> of having 100s of VMs thinking they each own the entire machine. >>>>> >>>>> I agree that the default ergonomics does not scale well. 
Anyone >>>>> doing any serious Java deployment tunes the VM explicitly and does >>>>> not rely on the defaults. How will they do that in a container >>>>> environment? I don't know. >>>>> >>>>> I would love to see some actual deployment scenarios/experiences >>>>> for this to understand things better. >>>> This is one of the reasons I want to get this support out in JDK 10, >>>> to get some feedback under real scenarios. >>>>> >>>>>> I haven?t even done any experiments to determine the added context >>>>>> switching cost if the VM decides to use excessive >>>>>> pthreads. >>>>>>> >>>>>>> That's not to say an API to provide load/shares/quota information >>>>>>> may not be useful, but that is a separate issue to what the >>>>>>> "active processor count" should report. >>>>>> I don?t have a problem with active processor count reporting the >>>>>> number of processors we have, but I do have a problem >>>>>> with our current usage of this information within the VM and Core >>>>>> libraries. >>>>> >>>>> That is a somewhat separate issue. One worth pursuing separately. >>>> We should look at this as part of the ?Container aware Java? JEP. >>>>> >>>>>>> >>>>>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.01 >>>>>>>> Updates: >>>>>>>> 1. I had to move the processing of AggressiveHeap since the >>>>>>>> container memory size needs to be known before this can be >>>>>>>> processed. >>>>>>> >>>>>>> I don't like the placement of this - we don't call os:: init >>>>>>> functions from inside Arguments - we manage the initialization >>>>>>> sequence from Threads::create_vm. Seems to me that container >>>>>>> initialization can/should happen in os::init_before_ergo, and the >>>>>>> AggressiveHeap processing can occur at the start of >>>>>>> Arguments::apply_ergo(). >>>>>>> >>>>>>> That said we need to be sure nothing touched by >>>>>>> set_aggressive_heap_flags will be used before we now reach that >>>>>>> code - there are a lot of flags being set in there. >>>>>> This is exactly the reason why I put the call where it did.? I put >>>>>> the call to set_aggressive_heap_flags in finalize_vm_init_args >>>>>> because that is exactly what this call is doing.? It?s finalizing >>>>>> flags used after the parsing.? The impacted flags are definitely >>>>>> being >>>>>> used shortly after and before init_before_ergo is called. >>>>> >>>>> I see that now and it is very unfortunate because I really do not >>>>> like what you had to do here. As you can tell from the logic in >>>>> create_vm we have always refactored to ensure we can progressively >>>>> manage the interleaving of OS initialization with Arguments >>>>> processing. So having a deep part of Argument processing go off and >>>>> call some more OS initialization is not nice. That said I can't see >>>>> a way around it without very unreasonable refactoring. >>>>> >>>>> But I do have a couple of changes I'd like to request please: >>>>> >>>>> 1. Move the call to os::initialize_container_support() up a level >>>>> to before the call to finalize_vm_init_args(), with a more >>>>> elaborate comment: >>>>> >>>>> // We need to ensure processor and memory resources have been properly >>>>> // configured - which may rely on arguments we just processed - before >>>>> // doing the final argument processing. Any argument processing that >>>>> // needs to know about processor and memory resources must occur after >>>>> // this point. 
>>>>> >>>>> os::initialize_container_support(); >>>>> >>>>> // Do final processing now that all arguments have been parsed >>>>> result = finalize_vm_init_args(patch_mod_javabase); >>>>> >>>>> 2. Simplify and modify os.hpp as follows: >>>>> >>>>> +? LINUX_ONLY(static void pd_initialize_container_support();) >>>>> >>>>> ?? public: >>>>> ??? static void init(void);????????????????????? // Called before >>>>> command line parsing >>>>> >>>>> +?? static void initialize_container_support() { // Called during >>>>> command line parsing >>>>> +???? LINUX_ONLY(pd_initialize_container_support();) >>>>> +?? } >>>>> >>>>> ??? static void init_before_ergo(void);????????? // Called after >>>>> command line parsing >>>>> ???????????????????????????????????????????????? // before VM >>>>> ergonomics >>>>> >>>>> 3. In thread.cpp add a comment here: >>>>> >>>>> ?? // Parse arguments >>>>> +? // Note: this internally calls os::initialize_container_support() >>>>> ?? jint parse_result = Arguments::parse(args); >>>> All very reasonable changes. >>>> Thanks, >>>> Bob. >>>>> >>>>> Thanks. >>>>> >>>>>>> >>>>>>>> 2. I no longer use the cpuset.cpus contents since >>>>>>>> sched_getaffinity reports the correct results >>>>>>>> even if someone manually updates the cgroup data.? I originally >>>>>>>> didn?t think this was the case since >>>>>>>> sched_setaffinity didn?t automatically update the cpuset file >>>>>>>> contents but the inverse is true. >>>>>>> >>>>>>> Ok. >>>>>>> >>>>>>>> 3. I ifdef?d the container function support in >>>>>>>> src/hotspot/share/runtime/os.hpp to avoid putting stubs in all >>>>>>>> other os >>>>>>>> platform directories.? I can do this if it?s absolutely necessary. >>>>>>> >>>>>>> You should not need to do this if initialization moves as I >>>>>>> suggested above. os::init_before_ergo() in os_linux.cpp can call >>>>>>> OSContainer::init(). >>>>>>> No need for os::initialize_container_support() or >>>>>>> os::pd_initialize_container_support. >>>>>> But os::init_before_ergo is in shared code. >>>>> >>>>> Yep my bad - point is moot now anyway. >>>>> >>>>> >>>>> >>>>>>> src/hotspot/os/linux/os_linux.cpp/.hpp >>>>>>> >>>>>>> 187???????? log_trace(os)("available container memory: " >>>>>>> JULONG_FORMAT, avail_mem); >>>>>>> 188???????? return avail_mem; >>>>>>> 189?????? } else { >>>>>>> 190???????? log_debug(os,container)("container memory usage call >>>>>>> failed: " JLONG_FORMAT, mem_usage); >>>>>>> >>>>>>> Why "trace" (the third logging level) to show the information, >>>>>>> but "debug" (the second level) to show failed calls? You use >>>>>>> debug in other files for basic info. Overall I'm unclear on your >>>>>>> use of debug versus trace for the logging. >>>>>> I use trace for noisy information that is not reporting errors and >>>>>> debug for failures that are informational and not fatal. >>>>>> In this case, the call could return -1 or -2.? -1 is unlimited and >>>>>> -2 is an error.? In either case we fallback to the >>>>>> standard system call to get available memory.? I would have used >>>>>> warning but since these messages were occurring >>>>>> during a test run causing test failures. >>>>> >>>>> Okay. Thanks for clarifying. >>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/os/linux/osContainer_linux.cpp >>>>>>> >>>>>>> Dead code: >>>>>>> >>>>>>> 376 #if 0 >>>>>>> 377?? os::Linux::print_container_info(tty); >>>>>>> ... >>>>>>> 390 #endif >>>>>> I left it in for standalone testing.? Should I use some other #if? 
>>>>> >>>>> We don't generally leave in dead code in the runtime code. Do you >>>>> see this as useful after you've finalized the changes? >>>>> >>>>> Is this testing just for showing the logging? Is it worth making >>>>> this a logging controlled call? Is it suitable for a Gtest test? >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> Bob. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> Bob. >> From rkennke at redhat.com Mon Oct 23 12:21:08 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 23 Oct 2017 14:21:08 +0200 Subject: RFR: 8184914: Use MacroAssembler::cmpoop() consistently when comparing heap objects In-Reply-To: References: <8d667010-f17e-7d1b-088b-106999e3b005@redhat.com> <9b629556-b3f0-e52e-35e0-711c6a767e95@oracle.com> <55bb0f72-df71-44bc-53a0-7d982ab1ca04@redhat.com> Message-ID: <810cfcd2-95ed-9df8-0910-dd2beecbdd48@redhat.com> Hi Coleen, thank you. Can you sponsor it? Do you need anything from me? Thanks, Roman > I'm calling this as "trivial" and can be pushed now. > Thanks, > Coleen > > On 10/17/17 5:05 PM, Roman Kennke wrote: >> >>> >>> This looks reasonable to me.? Maybe the compiler group should review >>> the c1 part.? I changed the mailing list to hotspot-dev. >>> I can sponsor this for you. >> Thanks, thanks and thanks! ;-) >> >> Roman >> >>> Thanks, >>> Coleen >>> >>> On 10/17/17 4:22 PM, Roman Kennke wrote: >>>> (Not sure if this is the correct list to ask.. if not, please let >>>> me know and/or redirect me) >>>> >>>> Currently, cmpoop() is only declared for 32-bit x86, and only used >>>> in 2 places in C1 to compare oops. In other places, oops are >>>> compared using cmpptr(). It would be useful to distinguish normal >>>> pointer comparisons from heap object comparisons, and use cmpoop() >>>> consistently for heap object comparisons. This would remove clutter >>>> in several places where we have #ifdef _LP64 around comparisons, >>>> and would also allow to insert necessary barriers for GCs that need >>>> them (e.g. Shenandoah) later. >>>> >>>> http://cr.openjdk.java.net/~rkennke/8184914/webrev.00/ >>>> >>>> >>>> Tested by running hotspot_gc jtreg tests. >>>> >>>> Can I get a review please? >>>> >>>> Thanks, Roman >>>> >>>> >>> >> > From coleen.phillimore at oracle.com Mon Oct 23 12:30:27 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 23 Oct 2017 08:30:27 -0400 Subject: RFR: 8184914: Use MacroAssembler::cmpoop() consistently when comparing heap objects In-Reply-To: <810cfcd2-95ed-9df8-0910-dd2beecbdd48@redhat.com> References: <8d667010-f17e-7d1b-088b-106999e3b005@redhat.com> <9b629556-b3f0-e52e-35e0-711c6a767e95@oracle.com> <55bb0f72-df71-44bc-53a0-7d982ab1ca04@redhat.com> <810cfcd2-95ed-9df8-0910-dd2beecbdd48@redhat.com> Message-ID: On 10/23/17 8:21 AM, Roman Kennke wrote: > Hi Coleen, > > thank you. Can you sponsor it? Do you need anything from me? I do not.? I'll push it now.? I'm curious why you didn't change any of the other platforms.? Or do you only need this for x86? thanks, Coleen > > Thanks, Roman > >> I'm calling this as "trivial" and can be pushed now. >> Thanks, >> Coleen >> >> On 10/17/17 5:05 PM, Roman Kennke wrote: >>> >>>> >>>> This looks reasonable to me.? Maybe the compiler group should >>>> review the c1 part.? I changed the mailing list to hotspot-dev. >>>> I can sponsor this for you. >>> Thanks, thanks and thanks! ;-) >>> >>> Roman >>> >>>> Thanks, >>>> Coleen >>>> >>>> On 10/17/17 4:22 PM, Roman Kennke wrote: >>>>> (Not sure if this is the correct list to ask.. 
if not, please let >>>>> me know and/or redirect me) >>>>> >>>>> Currently, cmpoop() is only declared for 32-bit x86, and only used >>>>> in 2 places in C1 to compare oops. In other places, oops are >>>>> compared using cmpptr(). It would be useful to distinguish normal >>>>> pointer comparisons from heap object comparisons, and use cmpoop() >>>>> consistently for heap object comparisons. This would remove >>>>> clutter in several places where we have #ifdef _LP64 around >>>>> comparisons, and would also allow to insert necessary barriers for >>>>> GCs that need them (e.g. Shenandoah) later. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8184914/webrev.00/ >>>>> >>>>> >>>>> Tested by running hotspot_gc jtreg tests. >>>>> >>>>> Can I get a review please? >>>>> >>>>> Thanks, Roman >>>>> >>>>> >>>> >>> >> > From rkennke at redhat.com Mon Oct 23 12:47:11 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 23 Oct 2017 14:47:11 +0200 Subject: RFR: 8184914: Use MacroAssembler::cmpoop() consistently when comparing heap objects In-Reply-To: References: <8d667010-f17e-7d1b-088b-106999e3b005@redhat.com> <9b629556-b3f0-e52e-35e0-711c6a767e95@oracle.com> <55bb0f72-df71-44bc-53a0-7d982ab1ca04@redhat.com> <810cfcd2-95ed-9df8-0910-dd2beecbdd48@redhat.com> Message-ID: <9071d7a5-1837-51e0-ec09-ba5922415811@redhat.com> Am 23.10.2017 um 14:30 schrieb coleen.phillimore at oracle.com: > > > On 10/23/17 8:21 AM, Roman Kennke wrote: >> Hi Coleen, >> >> thank you. Can you sponsor it? Do you need anything from me? > > I do not.? I'll push it now.? I'm curious why you didn't change any of > the other platforms.? Or do you only need this for x86? > thanks, > Coleen x86 seemed most important for now because there was this odd discrepancy between 32 and 64 bit (cmpoop did exist before, but only for one special case in 32 bit). I will do something similar for aarch64 later (need it for Shenandoah). Others will have to fill in the required parts for Shenandoah to other platforms, if they need it. Thanks for all your help! Roman From robbin.ehn at oracle.com Mon Oct 23 15:16:41 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 23 Oct 2017 17:16:41 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <3018D48F-245A-4C92-9CED-5692BBD88E8C@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <3018D48F-245A-4C92-9CED-5692BBD88E8C@oracle.com> Message-ID: <47c2ac8e-151e-267f-28bd-f76ed5ef5357@oracle.com> Hi, On 2017-10-20 18:24, Karen Kinnear wrote: > Robbin, Erik, Mikael - > > Delighted to see this! Looks good. I don?t need to see any updates - these are minor comments. > Thank you for the performance testing > > Couple of questions/comments: > 1. platform support > supports_thread_local_poll returns true for AMD64 or SPARC > Your comment said Linux x64 and Sparc only. > What about Mac and Windows? Sorry it should be x64 and SPARC, OS is not important. (so yes mac and windows) > > 2. safepointMechanism_inline.hpp - comment clarification > line 42 - ?Mutexes can be taken but none JavaThread?. > Are you saying: ?Non-JavaThreads do not support handshakes, but must stop for > safepoints.? > Not sure what the Mutex comment is about Fixed: "// If the poll is on a non-java thread, we can only check the global state." This is possible from e.g. Monitor::TrySpin. > > 3. globals.hpp > The way I understand this - ThreadLocalHandshakes flag is not so much to enable > use of ThreadLocalHandle operations, but to enable use of TLH for global safe point. 
> If that is true, could you possibly at least clarify this in the comment if there is not > a better name for the flag? Fixed "Use thread-local polls instead of global poll for safepoints." We can also do better name of option, e.g. -XX:+(Use)ThreadLocalPoll ? Let me know. > > 4. thank you for looking into startup performance and interpreter return/backward branch checks. We are committed to fix this before 18.3! > > 5. handshake.cpp > Could you possibly add a comment that thread_has_completed and/or pool_for_completed_thread > means that the thread has either done the operation or the operation has been cancelled? > I get that we are polling this to tell when it is safe to return to the synchronous requestor not to > determine if the thread actually performed the operation. The comment would make that clearer. Fixed Incremental: http://cr.openjdk.java.net/~rehn/8185640/v3/Assorted-Karen-5/webrev/ Again let me know if anyone needs another kind! Thanks Karen! /Robbin > > thanks, > Karen > >> On Oct 11, 2017, at 9:37 AM, Robbin Ehn wrote: >> >> Hi all, >> >> Starting the review of the code while JEP work is still not completed. >> >> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >> >> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not just all threads or none. >> >> Entire changeset: >> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >> >> Divided into 3-parts, >> SafepointMechanism abstraction: >> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >> Consolidating polling page allocation: >> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >> Handshakes: >> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >> >> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a handshake can be performed with that single JavaThread as well. >> >> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >> >> Example of potential use-cases: >> -Biased lock revocation >> -External requests for stack traces >> -Deoptimization >> -Async exception delivery >> -External suspension >> -Eliding memory barriers >> >> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported platforms are Linux x64 and Solaris SPARC. >> >> Tested heavily with various test suits and comes with a few new tests. >> >> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically ensured). 
A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all JavaThreads in an array instead of a linked list. >> >> Thanks, Robbin > From robbin.ehn at oracle.com Mon Oct 23 15:26:26 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 23 Oct 2017 17:26:26 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> Message-ID: <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> Hi Martin, On 2017-10-18 16:05, Doerr, Martin wrote: > Hi Robbin, > > thanks for the quick reply and for doing additional benchmarks. > Please note that t->does_dispatch() was just a first idea, but doesn't really fit for the purpose because it's false for conditional branch bytecodes for example. I just didn't find an appropriate quick check in the existing code. > I guess you will notice a performance impact when benchmarking with -Xint. (I don't know if Oracle usually runs startup performance benchmarks.) Yes, we are seeing a performance regression, 2.5%-6% depending on benchmark. We are committed to fix this, but it might come as separate RFE/bug depending on the JEP's timeline. (If the fix, very unlikely, would not be done before next release, we would change the default to off) I hope this is an acceptable path? Thanks, Robbin > > Best regards, > Martin > > > -----Original Message----- > From: Robbin Ehn [mailto:robbin.ehn at oracle.com] > Sent: Mittwoch, 18. Oktober 2017 15:58 > To: Doerr, Martin ; hotspot-dev developers > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > Hi Martin, > > On 2017-10-18 12:11, Doerr, Martin wrote: >> Hi Robbin, >> >> so you would like to push your version first (as it does not break other platforms) and then help us to push non-Oracle platform implementations which change shared code again? >> I'd be fine with that, too. > > Yes, great! > >> >> While thinking a little longer about the interpreter implementation, a new idea came into my mind. >> I think we could significantly reduce impact on interpreter code size and performance by using safepoint polls only in a subset of bytecodes. E.g., we could use only bytecodes which perform any kind of jump by implementing something like >> if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) generate_safepoint_poll(); >> in TemplateInterpreterGenerator::generate_and_dispatch. > > We have not seen any performance regression in simple benchmark with this. > I will do a better benchmark and compare what difference it makes. > > Thanks, Robbin > >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >> Sent: Mittwoch, 18. Oktober 2017 11:07 >> To: Doerr, Martin ; hotspot-dev developers >> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >> >> Thanks for looking at this. >> >> On 2017-10-17 19:58, Doerr, Martin wrote: >>> Hi Robbin, >>> >>> my first impression is very good. Thanks for providing the webrev. >> >> Great! 
>> >>> >>> I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. >>> Would it be ok to move the decision between what to use to platform code? >>> (Some platforms could still use both if this is beneficial.) >>> >>> E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. >> >> I see no issue with this. >> Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. >> Can we do this incremental when adding the platform support for PPC64? >> >> Thanks, Robbin >> >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn >>> Sent: Mittwoch, 11. Oktober 2017 15:38 >>> To: hotspot-dev developers >>> Subject: RFR(XL): 8185640: Thread-local handshakes >>> >>> Hi all, >>> >>> Starting the review of the code while JEP work is still not completed. >>> >>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >>> >>> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not >>> just all threads or none. >>> >>> Entire changeset: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >>> >>> Divided into 3-parts, >>> SafepointMechanism abstraction: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >>> Consolidating polling page allocation: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >>> Handshakes: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >>> >>> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread >>> itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be >>> performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a >>> handshake can be performed with that single JavaThread as well. >>> >>> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the >>> guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >>> >>> Example of potential use-cases: >>> -Biased lock revocation >>> -External requests for stack traces >>> -Deoptimization >>> -Async exception delivery >>> -External suspension >>> -Eliding memory barriers >>> >>> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >>> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported >>> platforms are Linux x64 and Solaris SPARC. >>> >>> Tested heavily with various test suits and comes with a few new tests. 
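(Aside, to make the _poll_armed_value / _poll_disarmed_value discussion above concrete: a minimal, purely illustrative sketch of a per-thread poll word with platform-chosen armed and disarmed values could look roughly like this. None of the names below are taken from the webrev; they are placeholders.)

#include <cstdint>

// Sketch only. "Armed" points into the protected guard page (with the poll
// bit set), "disarmed" points at an ordinary readable page, so the same
// per-thread word works for a trapping scheme and for a load-and-test scheme.
struct PollValues {
  uintptr_t armed_value;     // e.g. guard_page_base | poll_bit
  uintptr_t disarmed_value;  // e.g. readable_page_base
};

struct PerThreadPoll {
  volatile uintptr_t polling_word;   // one word per JavaThread

  void arm(const PollValues& v)    { polling_word = v.armed_value; }
  void disarm(const PollValues& v) { polling_word = v.disarmed_value; }

  // What a generated poll effectively tests before taking the slow path:
  bool is_armed(uintptr_t poll_bit) const {
    return (polling_word & poll_bit) != 0;
  }
};
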
>>> >>> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically >>> ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >>> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on >>> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all >>> JavaThreads in an array instead of a linked list. >>> >>> Thanks, Robbin >>> From martin.doerr at sap.com Mon Oct 23 15:40:44 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 23 Oct 2017 15:40:44 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> Message-ID: Hi Coleen and Robbin, I'm ok with putting it into a separate RFE. I understand that there are more fun activities than rebasing this XL change for a long time :-) So you don't need to delay it. It's acceptable for me. Thanks, Coleen, for sharing your proposal. I appreciate it. Best regards, Martin -----Original Message----- From: Robbin Ehn [mailto:robbin.ehn at oracle.com] Sent: Montag, 23. Oktober 2017 17:26 To: Doerr, Martin ; hotspot-dev developers Subject: Re: RFR(XL): 8185640: Thread-local handshakes Hi Martin, On 2017-10-18 16:05, Doerr, Martin wrote: > Hi Robbin, > > thanks for the quick reply and for doing additional benchmarks. > Please note that t->does_dispatch() was just a first idea, but doesn't really fit for the purpose because it's false for conditional branch bytecodes for example. I just didn't find an appropriate quick check in the existing code. > I guess you will notice a performance impact when benchmarking with -Xint. (I don't know if Oracle usually runs startup performance benchmarks.) Yes, we are seeing a performance regression, 2.5%-6% depending on benchmark. We are committed to fix this, but it might come as separate RFE/bug depending on the JEP's timeline. (If the fix, very unlikely, would not be done before next release, we would change the default to off) I hope this is an acceptable path? Thanks, Robbin > > Best regards, > Martin > > > -----Original Message----- > From: Robbin Ehn [mailto:robbin.ehn at oracle.com] > Sent: Mittwoch, 18. Oktober 2017 15:58 > To: Doerr, Martin ; hotspot-dev developers > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > Hi Martin, > > On 2017-10-18 12:11, Doerr, Martin wrote: >> Hi Robbin, >> >> so you would like to push your version first (as it does not break other platforms) and then help us to push non-Oracle platform implementations which change shared code again? >> I'd be fine with that, too. > > Yes, great! > >> >> While thinking a little longer about the interpreter implementation, a new idea came into my mind. >> I think we could significantly reduce impact on interpreter code size and performance by using safepoint polls only in a subset of bytecodes. 
E.g., we could use only bytecodes which perform any kind of jump by implementing something like >> if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) generate_safepoint_poll(); >> in TemplateInterpreterGenerator::generate_and_dispatch. > > We have not seen any performance regression in simple benchmark with this. > I will do a better benchmark and compare what difference it makes. > > Thanks, Robbin > >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >> Sent: Mittwoch, 18. Oktober 2017 11:07 >> To: Doerr, Martin ; hotspot-dev developers >> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >> >> Thanks for looking at this. >> >> On 2017-10-17 19:58, Doerr, Martin wrote: >>> Hi Robbin, >>> >>> my first impression is very good. Thanks for providing the webrev. >> >> Great! >> >>> >>> I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. >>> Would it be ok to move the decision between what to use to platform code? >>> (Some platforms could still use both if this is beneficial.) >>> >>> E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. >> >> I see no issue with this. >> Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. >> Can we do this incremental when adding the platform support for PPC64? >> >> Thanks, Robbin >> >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn >>> Sent: Mittwoch, 11. Oktober 2017 15:38 >>> To: hotspot-dev developers >>> Subject: RFR(XL): 8185640: Thread-local handshakes >>> >>> Hi all, >>> >>> Starting the review of the code while JEP work is still not completed. >>> >>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >>> >>> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not >>> just all threads or none. >>> >>> Entire changeset: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >>> >>> Divided into 3-parts, >>> SafepointMechanism abstraction: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >>> Consolidating polling page allocation: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >>> Handshakes: >>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >>> >>> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread >>> itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be >>> performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a >>> handshake can be performed with that single JavaThread as well. >>> >>> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the >>> guard page. 
In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >>> >>> Example of potential use-cases: >>> -Biased lock revocation >>> -External requests for stack traces >>> -Deoptimization >>> -Async exception delivery >>> -External suspension >>> -Eliding memory barriers >>> >>> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >>> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported >>> platforms are Linux x64 and Solaris SPARC. >>> >>> Tested heavily with various test suits and comes with a few new tests. >>> >>> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically >>> ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >>> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on >>> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all >>> JavaThreads in an array instead of a linked list. >>> >>> Thanks, Robbin >>> From karen.kinnear at oracle.com Mon Oct 23 15:58:55 2017 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Mon, 23 Oct 2017 08:58:55 -0700 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> Message-ID: <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> Works for me Thanks, Karen > On Oct 23, 2017, at 8:40 AM, Doerr, Martin wrote: > > Hi Coleen and Robbin, > > I'm ok with putting it into a separate RFE. I understand that there are more fun activities than rebasing this XL change for a long time :-) > So you don't need to delay it. It's acceptable for me. > > Thanks, Coleen, for sharing your proposal. I appreciate it. > > Best regards, > Martin > > > -----Original Message----- > From: Robbin Ehn [mailto:robbin.ehn at oracle.com] > Sent: Montag, 23. Oktober 2017 17:26 > To: Doerr, Martin ; hotspot-dev developers > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > Hi Martin, > >> On 2017-10-18 16:05, Doerr, Martin wrote: >> Hi Robbin, >> >> thanks for the quick reply and for doing additional benchmarks. >> Please note that t->does_dispatch() was just a first idea, but doesn't really fit for the purpose because it's false for conditional branch bytecodes for example. I just didn't find an appropriate quick check in the existing code. >> I guess you will notice a performance impact when benchmarking with -Xint. (I don't know if Oracle usually runs startup performance benchmarks.) > > Yes, we are seeing a performance regression, 2.5%-6% depending on benchmark. > We are committed to fix this, but it might come as separate RFE/bug depending on > the JEP's timeline. > > (If the fix, very unlikely, would not be done before next release, we would > change the default to off) > > I hope this is an acceptable path? 
> > Thanks, Robbin > >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >> Sent: Mittwoch, 18. Oktober 2017 15:58 >> To: Doerr, Martin ; hotspot-dev developers >> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >> >> Hi Martin, >> >>> On 2017-10-18 12:11, Doerr, Martin wrote: >>> Hi Robbin, >>> >>> so you would like to push your version first (as it does not break other platforms) and then help us to push non-Oracle platform implementations which change shared code again? >>> I'd be fine with that, too. >> >> Yes, great! >> >>> >>> While thinking a little longer about the interpreter implementation, a new idea came into my mind. >>> I think we could significantly reduce impact on interpreter code size and performance by using safepoint polls only in a subset of bytecodes. E.g., we could use only bytecodes which perform any kind of jump by implementing something like >>> if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) generate_safepoint_poll(); >>> in TemplateInterpreterGenerator::generate_and_dispatch. >> >> We have not seen any performance regression in simple benchmark with this. >> I will do a better benchmark and compare what difference it makes. >> >> Thanks, Robbin >> >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>> Sent: Mittwoch, 18. Oktober 2017 11:07 >>> To: Doerr, Martin ; hotspot-dev developers >>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>> >>> Thanks for looking at this. >>> >>>> On 2017-10-17 19:58, Doerr, Martin wrote: >>>> Hi Robbin, >>>> >>>> my first impression is very good. Thanks for providing the webrev. >>> >>> Great! >>> >>>> >>>> I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. >>>> Would it be ok to move the decision between what to use to platform code? >>>> (Some platforms could still use both if this is beneficial.) >>>> >>>> E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. >>> >>> I see no issue with this. >>> Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. >>> Can we do this incremental when adding the platform support for PPC64? >>> >>> Thanks, Robbin >>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn >>>> Sent: Mittwoch, 11. Oktober 2017 15:38 >>>> To: hotspot-dev developers >>>> Subject: RFR(XL): 8185640: Thread-local handshakes >>>> >>>> Hi all, >>>> >>>> Starting the review of the code while JEP work is still not completed. >>>> >>>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >>>> >>>> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not >>>> just all threads or none. 
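(Aside: a purely illustrative sketch of what "stop individual threads and not just all threads or none" means for a client of this machinery. The closure and function names below are assumptions chosen for illustration, not necessarily the final API.)

// A per-thread operation, run while the target JavaThread is in a
// safepoint-safe state, either by the target itself or by the VM thread
// while the target is kept blocked.
class JavaThread;   // HotSpot type, declaration only for the sketch

class HandshakeOperationSketch {
 public:
  virtual ~HandshakeOperationSketch() {}
  virtual void do_thread(JavaThread* target) = 0;   // the per-thread callback
};

class CollectStackTrace : public HandshakeOperationSketch {
 public:
  void do_thread(JavaThread* target) {
    // e.g. walk and record target's stack here; only this one thread is
    // stopped, every other JavaThread keeps running.
  }
};

// Hypothetical entry point: run the operation for one known-running thread.
void handshake_one_thread(HandshakeOperationSketch* op, JavaThread* target);
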
>>>> >>>> Entire changeset: >>>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >>>> >>>> Divided into 3-parts, >>>> SafepointMechanism abstraction: >>>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >>>> Consolidating polling page allocation: >>>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >>>> Handshakes: >>>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >>>> >>>> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread >>>> itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be >>>> performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a >>>> handshake can be performed with that single JavaThread as well. >>>> >>>> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the >>>> guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >>>> >>>> Example of potential use-cases: >>>> -Biased lock revocation >>>> -External requests for stack traces >>>> -Deoptimization >>>> -Async exception delivery >>>> -External suspension >>>> -Eliding memory barriers >>>> >>>> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >>>> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported >>>> platforms are Linux x64 and Solaris SPARC. >>>> >>>> Tested heavily with various test suits and comes with a few new tests. >>>> >>>> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically >>>> ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >>>> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on >>>> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all >>>> JavaThreads in an array instead of a linked list. >>>> >>>> Thanks, Robbin >>>> From aph at redhat.com Mon Oct 23 16:36:03 2017 From: aph at redhat.com (Andrew Haley) Date: Mon, 23 Oct 2017 17:36:03 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: <33aff570-5bdb-d1aa-bccd-f6122db61051@redhat.com> This is a bad way to handle supports_thread_local_poll(): static bool supports_thread_local_poll() { #if defined(AMD64) || defined(SPARC) return true; #else return false; #endif } Instead, it is better to use a flag which is #defined in the back ends, and allow each back end to specify if it supports thread-local handshakes. 
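(Aside: one way to read the suggestion above, sketched very roughly. The macro name and file placement are made up for illustration; the only point is that the decision moves from a CPU-family #if in shared code into something each back end declares for itself.)

// In each CPU back end that implements the per-thread poll, e.g. in its
// globalDefinitions_<cpu>.hpp (name illustrative):
//
//   #define THREAD_LOCAL_POLL
//
// Shared code then keys off the back-end macro instead of the CPU family:

static bool supports_thread_local_poll() {
#ifdef THREAD_LOCAL_POLL
  return true;
#else
  return false;
#endif
}

With something like this, a back end that does not (yet) implement the poll simply does not define the macro, and the shared fallback to a normal safepoint kicks in without any shared-code edit.
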
We have *two* AARCH64 back ends, and only one of them supports thread-local handshakes; both of them #define AARCH64. #if defined(BLAH) should be reserved for hardware-specific properties, not back-end-specific properties. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From volker.simonis at gmail.com Mon Oct 23 17:15:01 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 23 Oct 2017 19:15:01 +0200 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: <1c2eeaa1-334a-4744-ba31-87e580faafa5@oracle.com> References: <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> <4109f960-078f-e582-3c78-71f201a265fd@redhat.com> <1c2eeaa1-334a-4744-ba31-87e580faafa5@oracle.com> Message-ID: Hi Vladimir, that's a good suggestion! I've did so and prepared a new webrev: http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v4/ I've also verified that: http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v2/ still applies after 8166317.v4 Thank you and best regards, Volker On Tue, Oct 17, 2017 at 7:49 PM, Vladimir Kozlov wrote: > Hi, Volker > > You can do a trick with NOT_SPARC() macro to avoid defining empty method on > all platforms: > > +#if INCLUDE_ALL_GCS > +void g1_barrier_stubs_init() NOT_SPARC( {} ); // depends on universe_init, > must be before interpreter_init > +#endif > > I thought we pushed 8187091 already. I will keep it in mind. > > Thanks, > Vladimir > > > On 10/10/17 10:17 AM, Volker Simonis wrote: >> >> On Tue, Oct 10, 2017 at 9:42 AM, Andrew Haley wrote: >>> >>> On 09/10/17 20:24, Volker Simonis wrote: >>>> >>>> Unfortunately we can't easily generate these stubs during >>>> 'stubRoutines_init1()' because >>>> 'generate_dirty_card_log_enqueue_if_necessary()' needs the byte map >>>> base address which is only initialized in >>>> 'CardTableModRefBS::initialize()' during 'univers_init()' which >>>> happens after 'stubRoutines_init1()'. >>> >>> >>> Yes you can, you can do something like we do for narrow_ptrs_base: >>> >>> if (Universe::is_fully_initialized()) { >>> mov(rheapbase, Universe::narrow_ptrs_base()); >>> } else { >>> lea(rheapbase, >>> ExternalAddress((address)Universe::narrow_ptrs_base_addr())); >>> ldr(rheapbase, Address(rheapbase)); >>> } >>> >> >> Hi Andrew, >> >> thanks for your suggestion. Yes, I could do that, but that would >> replace a constant load in the barrier with a constant load plus a >> load from memory, because during stubRoutines_init1() heap won't be >> initialized. Not sure about this, but I think we want to avoid this >> overhead in the barriers. >> >> Also, Christian proposed in a previous mail to replace the G1 barrier >> stubs on SPARC with simple runtime calls like on other platforms. >> While I think that it is probably worthwhile thinking about such a >> change, I don't know the exact history of these stubs and probably >> some GC experts should decide if that's really a good idea. I'd be >> happy to open an extra issue for following up on that path. 
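(Aside, on the NOT_SPARC() trick quoted above: the platform macros expand to their argument on the named platform and to nothing elsewhere, which is what lets a single declaration give SPARC a real out-of-line definition and every other platform an empty inline body. A stripped-down sketch of the pattern, not the actual macros.hpp definitions:)

#ifdef SPARC
#define SPARC_ONLY(code) code
#define NOT_SPARC(code)
#else
#define SPARC_ONLY(code)
#define NOT_SPARC(code) code
#endif

// Declared once in shared code; SPARC supplies the real body in its own
// file, every other platform gets the empty inline definition:
void example_platform_init() NOT_SPARC({});   // example name, not the patch
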
>> >> But for the moments I've simply added a new initialization step >> "g1_barrier_stubs_init()" between 'univers_init()' and >> interpreter_init() which is empty on all platforms except SPARC where >> it generates the corresponding stubs: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v3/ >> >> I've built and smoke-tested the new change on Windows, MacOS, >> Solaris/SPARC, AIX, Linux/x86_64/ppc64/ppc64le/s390. Unfortunately I >> don't have access to ARM machines so I couldn't check arm,arm64 and >> aarch64 although I don't expect any problems there (actually I've just >> added an empty method there). But it would be great if somebody could >> check that for any case. >> >> @Vladimir: I've also rebased the change for "8187091: >> ReturnBlobToWrongHeapTest fails because of problems in >> CodeHeap::contains_blob()": >> >> http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v2/ >> >> Because it changes the same files like 8166317 it should be applied >> and pushed only after 8166317 was pushed. >> >> Thank you and best regards, >> Volker >> >>> -- >>> Andrew Haley >>> Java Platform Lead Engineer >>> Red Hat UK Ltd. >>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From bob.vandette at oracle.com Mon Oct 23 18:28:31 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Mon, 23 Oct 2017 14:28:31 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <2d9dd746-63e1-cade-28f9-5ca1ae1c253e@oracle.com> <200F07CB-35DA-492B-B78D-9EC033EE0431@oracle.com> <833ba1a5-49fc-bb24-ff99-994011af52aa@oracle.com> Message-ID: Thanks Kim! Bob. > On Oct 23, 2017, at 12:52 AM, Kim Barrett wrote: > >> On Sep 27, 2017, at 9:20 PM, David Holmes wrote: >>>> 62 void set_subsystem_path(char *cgroup_path) { >>>> >>>> If this takes a "const char*" will it save you from casting string literals to "char*" elsewhere? >>> I tried several different ways of declaring the container accessor functions and >>> always ended up with warnings due to scanf not being able to validate arguments >>> since the format string didn?t end up being a string literal. I originally was using templates >>> and then ended up with the macros. I tried several different casts but could resolve the problem. >> >> Sounds like something Kim Barrett should take a look at :) > > Fortunately, I just happened by. > > The warnings are because we compile with -Wformat=2, which enables > -Wformat-nonliteral (among other things). > > Use PRAGMA_FORMAT_NONLITERAL_IGNORED, e.g. > > PRAGMA_DIAG_PUSH > PRAGMA_FORMAT_NONLITERAL_IGNORED > > PRAGMA_DIAG_POP > > That will silence warnings about sscanf (or anything else!) with a > non-literal format string within that . > > Also, while I was looking at this, I noticed that in > get_subsytem_file_contents_##return_name, if the sum of the lengths of > get_subsystem_path() and filename is >= MAXBUF, then we can end up > reading from a file other than the one intended, if such a file > exists. That seems like it might be bad. > > Also, the filename argument should be const char*. 
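(Aside, to make the MAXBUF concern above concrete: a hedged sketch of building the subsystem file path with an explicit truncation check, rather than silently clipping and opening the wrong file. This is a hypothetical helper for illustration, not the code in the webrev.)

#include <cstdio>

// 'subsystem_path' and 'filename' stand in for the values the container
// code derives from /proc/self/mountinfo and the cgroup files.
static FILE* open_subsystem_file(const char* subsystem_path,
                                 const char* filename) {
  char path[1024];   // stand-in for MAXBUF
  int n = snprintf(path, sizeof(path), "%s/%s", subsystem_path, filename);
  if (n < 0 || (size_t)n >= sizeof(path)) {
    // Would have been truncated: refuse rather than read some other file.
    return NULL;
  }
  return fopen(path, "r");
}

The same check also sidesteps the strncat pitfalls, since snprintf reports the length it wanted to write instead of quietly stopping at the buffer edge.
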
> From mark.reinhold at oracle.com Mon Oct 23 19:43:00 2017 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Mon, 23 Oct 2017 12:43:00 -0700 (PDT) Subject: JEP 312: Thread-Local Handshakes Message-ID: <20171023194300.CA616EB325@eggemoggin.niobe.net> New JEP Candidate: http://openjdk.java.net/jeps/312 - Mark From vladimir.kozlov at oracle.com Mon Oct 23 20:00:17 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 23 Oct 2017 13:00:17 -0700 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: References: <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> <4109f960-078f-e582-3c78-71f201a265fd@redhat.com> <1c2eeaa1-334a-4744-ba31-87e580faafa5@oracle.com> Message-ID: <9c35ed03-5a85-8b14-6874-cd828f123d16@oracle.com> Looks good. I start new testing. Thanks, Vladimir On 10/23/17 10:15 AM, Volker Simonis wrote: > Hi Vladimir, > > that's a good suggestion! I've did so and prepared a new webrev: > > http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v4/ > > I've also verified that: > > http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v2/ > > still applies after 8166317.v4 > > Thank you and best regards, > Volker > > > On Tue, Oct 17, 2017 at 7:49 PM, Vladimir Kozlov > wrote: >> Hi, Volker >> >> You can do a trick with NOT_SPARC() macro to avoid defining empty method on >> all platforms: >> >> +#if INCLUDE_ALL_GCS >> +void g1_barrier_stubs_init() NOT_SPARC( {} ); // depends on universe_init, >> must be before interpreter_init >> +#endif >> >> I thought we pushed 8187091 already. I will keep it in mind. >> >> Thanks, >> Vladimir >> >> >> On 10/10/17 10:17 AM, Volker Simonis wrote: >>> >>> On Tue, Oct 10, 2017 at 9:42 AM, Andrew Haley wrote: >>>> >>>> On 09/10/17 20:24, Volker Simonis wrote: >>>>> >>>>> Unfortunately we can't easily generate these stubs during >>>>> 'stubRoutines_init1()' because >>>>> 'generate_dirty_card_log_enqueue_if_necessary()' needs the byte map >>>>> base address which is only initialized in >>>>> 'CardTableModRefBS::initialize()' during 'univers_init()' which >>>>> happens after 'stubRoutines_init1()'. >>>> >>>> >>>> Yes you can, you can do something like we do for narrow_ptrs_base: >>>> >>>> if (Universe::is_fully_initialized()) { >>>> mov(rheapbase, Universe::narrow_ptrs_base()); >>>> } else { >>>> lea(rheapbase, >>>> ExternalAddress((address)Universe::narrow_ptrs_base_addr())); >>>> ldr(rheapbase, Address(rheapbase)); >>>> } >>>> >>> >>> Hi Andrew, >>> >>> thanks for your suggestion. Yes, I could do that, but that would >>> replace a constant load in the barrier with a constant load plus a >>> load from memory, because during stubRoutines_init1() heap won't be >>> initialized. Not sure about this, but I think we want to avoid this >>> overhead in the barriers. >>> >>> Also, Christian proposed in a previous mail to replace the G1 barrier >>> stubs on SPARC with simple runtime calls like on other platforms. >>> While I think that it is probably worthwhile thinking about such a >>> change, I don't know the exact history of these stubs and probably >>> some GC experts should decide if that's really a good idea. I'd be >>> happy to open an extra issue for following up on that path. 
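(Aside: the ordering constraint in the NOT_SPARC() snippet above, "depends on universe_init, must be before interpreter_init", is easiest to see as a sketch of the relevant slice of VM start-up. The function names are taken from the discussion, but the surrounding code is elided and the wrapper name is invented, so treat this as illustrative only.)

// Assumed declarations, provided elsewhere in the VM:
void universe_init();
void g1_barrier_stubs_init();
void interpreter_init();

// Sketch of the initialization order being discussed:
void init_sequence_sketch() {
  // ... stubRoutines_init1() etc.: card table byte map base not known yet ...
  universe_init();           // heap set up, CardTableModRefBS::initialize() runs
  g1_barrier_stubs_init();   // new step: needs the byte map base; empty
                             // everywhere except SPARC
  interpreter_init();        // the template interpreter can now use the stubs
  // ...
}
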
>>> >>> But for the moments I've simply added a new initialization step >>> "g1_barrier_stubs_init()" between 'univers_init()' and >>> interpreter_init() which is empty on all platforms except SPARC where >>> it generates the corresponding stubs: >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v3/ >>> >>> I've built and smoke-tested the new change on Windows, MacOS, >>> Solaris/SPARC, AIX, Linux/x86_64/ppc64/ppc64le/s390. Unfortunately I >>> don't have access to ARM machines so I couldn't check arm,arm64 and >>> aarch64 although I don't expect any problems there (actually I've just >>> added an empty method there). But it would be great if somebody could >>> check that for any case. >>> >>> @Vladimir: I've also rebased the change for "8187091: >>> ReturnBlobToWrongHeapTest fails because of problems in >>> CodeHeap::contains_blob()": >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v2/ >>> >>> Because it changes the same files like 8166317 it should be applied >>> and pushed only after 8166317 was pushed. >>> >>> Thank you and best regards, >>> Volker >>> >>>> -- >>>> Andrew Haley >>>> Java Platform Lead Engineer >>>> Red Hat UK Ltd. >>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From adeel.iqbal at hotmail.com Sun Oct 22 14:23:32 2017 From: adeel.iqbal at hotmail.com (Adeel Iqbal) Date: Sun, 22 Oct 2017 14:23:32 +0000 Subject: Modify / Add Instruction set of Java & Add New SuperInstruction to It Message-ID: Hi, i am working on a project where i have to modify the java bytecode instruction set by adding custom instruction (SuperInstructions) as a replacement of sequence of instructions in order to reduce the size of the generated file and to modify the JVM to recognize these new instructions. can you please guide me. From david.holmes at oracle.com Tue Oct 24 05:07:40 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 24 Oct 2017 15:07:40 +1000 Subject: Modify / Add Instruction set of Java & Add New SuperInstruction to It In-Reply-To: References: Message-ID: <4b506c75-eabd-8325-680a-0e1acb24bba7@oracle.com> Hi, On 23/10/2017 12:23 AM, Adeel Iqbal wrote: > Hi, > i am working on a project where i have to modify the java bytecode instruction set by adding custom instruction (SuperInstructions) as a replacement of sequence of instructions in order to reduce the size of the generated file and to modify the JVM to recognize these new instructions. > can you please guide me. That's a significant project. Not knowing how much you know about anything makes it hard to give guidance. But you're not the first to attempt such a thing so I suggest doing some initial research. This is a quick hit I got when I googled "bytecode compaction for the JVM": https://link.springer.com/chapter/10.1007/978-3-642-13651-1_2 It's only a preview, you'll need to get full access to the paper by some means. But their project has a wikipedia entry: https://en.wikipedia.org/wiki/TakaTuka Disclaimer: I know nothing about this system. The theory is simple enough: 1. Identify the sequences you want to replace 2. Write a tool (or modify javac) to recognize the sequences and replace them with the new bytecode. 3. Add the new bytecode to the interpreter. Before proceeding with step 3 run the tool over your benchmark application and see if you're really achieving your goals with regards to saving space. But please don't expect step-by-step assistance with this. 
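(Aside, a toy illustration of step 2 above, operating on a raw bytecode array. The superinstruction opcode value is invented, and a real rewriter must decode instruction by instruction, patch branch offsets and exception tables, and regenerate stack map frames; none of that is shown here.)

#include <vector>
#include <cstdint>
#include <cstddef>

// Toy: fuse every "aload_0, getfield #index" pair into one made-up opcode.
// Assumes the scan starts at an instruction boundary and that no operand
// byte happens to look like aload_0, which a real tool must not assume.
std::vector<uint8_t> fuse_aload0_getfield(const std::vector<uint8_t>& code) {
  const uint8_t ALOAD_0  = 0x2a;
  const uint8_t GETFIELD = 0xb4;
  const uint8_t SUPER_ALOAD0_GETFIELD = 0xe0;   // invented, unused opcode value
  std::vector<uint8_t> out;
  for (size_t i = 0; i < code.size(); ) {
    if (i + 4 <= code.size() && code[i] == ALOAD_0 && code[i + 1] == GETFIELD) {
      out.push_back(SUPER_ALOAD0_GETFIELD);     // new opcode keeps the 2-byte index
      out.push_back(code[i + 2]);
      out.push_back(code[i + 3]);
      i += 4;                                   // consumed aload_0 + getfield + index
    } else {
      out.push_back(code[i]);
      i += 1;
    }
  }
  return out;
}
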
Cheers, David From jini.george at oracle.com Tue Oct 24 07:31:37 2017 From: jini.george at oracle.com (Jini George) Date: Tue, 24 Oct 2017 13:01:37 +0530 Subject: RFR: SA: JDK-8189798: SA cleanup - part 1 In-Reply-To: <18501902-23db-de6c-b83d-640cd33df836@oracle.com> References: <18501902-23db-de6c-b83d-640cd33df836@oracle.com> Message-ID: Adding hotspot-dev too. Thanks, Jini. On 10/24/2017 12:05 PM, Jini George wrote: > Hello, > > As a part of SA next, I am working on writing a test case which compares > the fields and the types of the fields of the SA java classes with the > corresponding entries in the vmStructs tables. This, to some extent, > would help in preventing errors in SA due to the changes in hotspot. As > a precursor to this, I am in the process of making some cleanup related > changes (mostly in SA). I plan to have the changes done in parts. For > this webrev, most of the changes are for: > > 1. Avoiding having some values being redefined in SA. Instead have those > exported through vmStructs, and read it in SA. > (CompactibleFreeListSpace::_min_chunk_size_in_bytes, > CompactibleFreeListSpace::IndexSetSize) > > Redefinition of hotspot values in SA makes SA error prone, when the > value gets altered in hotspot and the corresponding modification gets > missed out in SA. > > 2. To remove some unused code (JNIid.java). > 3. Add the missing "CMSBitMap::_bmStartWord" in vmStructs. > 4. Modify variable names in SA and hotspot to match the counterpart > names, so that the comparison of the fields become easier. Most of the > changes belong to this group. > > Could I please get reviews done for these precursor changes ? > > JBS Id: https://bugs.openjdk.java.net/browse/JDK-8189798 > webrev: http://cr.openjdk.java.net/~jgeorge/8189798/webrev.00/ > > Thank you, > Jini. > From volker.simonis at gmail.com Tue Oct 24 07:35:37 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 24 Oct 2017 09:35:37 +0200 Subject: RFR(M): 8166317: InterpreterCodeSize should be computed In-Reply-To: <9c35ed03-5a85-8b14-6874-cd828f123d16@oracle.com> References: <6704868d-caa7-51e0-4741-5d62f90d837c@oracle.com> <8c522d38-90db-2864-0778-6d5948b1f50c@oracle.com> <7fee08f1-8304-3026-19e9-844e618e98ea@oracle.com> <2bb4136a-8c0e-ac4c-0c03-af38ff79ab40@oracle.com> <5b5219a5-960e-363b-2bdc-3613f1dae62c@oracle.com> <4109f960-078f-e582-3c78-71f201a265fd@redhat.com> <1c2eeaa1-334a-4744-ba31-87e580faafa5@oracle.com> <9c35ed03-5a85-8b14-6874-cd828f123d16@oracle.com> Message-ID: Thanks, Volker On Mon, Oct 23, 2017 at 10:00 PM, Vladimir Kozlov wrote: > Looks good. I start new testing. > > Thanks, > Vladimir > > > On 10/23/17 10:15 AM, Volker Simonis wrote: >> >> Hi Vladimir, >> >> that's a good suggestion! I've did so and prepared a new webrev: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v4/ >> >> I've also verified that: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v2/ >> >> still applies after 8166317.v4 >> >> Thank you and best regards, >> Volker >> >> >> On Tue, Oct 17, 2017 at 7:49 PM, Vladimir Kozlov >> wrote: >>> >>> Hi, Volker >>> >>> You can do a trick with NOT_SPARC() macro to avoid defining empty method >>> on >>> all platforms: >>> >>> +#if INCLUDE_ALL_GCS >>> +void g1_barrier_stubs_init() NOT_SPARC( {} ); // depends on >>> universe_init, >>> must be before interpreter_init >>> +#endif >>> >>> I thought we pushed 8187091 already. I will keep it in mind. 
>>> >>> Thanks, >>> Vladimir >>> >>> >>> On 10/10/17 10:17 AM, Volker Simonis wrote: >>>> >>>> >>>> On Tue, Oct 10, 2017 at 9:42 AM, Andrew Haley wrote: >>>>> >>>>> >>>>> On 09/10/17 20:24, Volker Simonis wrote: >>>>>> >>>>>> >>>>>> Unfortunately we can't easily generate these stubs during >>>>>> 'stubRoutines_init1()' because >>>>>> 'generate_dirty_card_log_enqueue_if_necessary()' needs the byte map >>>>>> base address which is only initialized in >>>>>> 'CardTableModRefBS::initialize()' during 'univers_init()' which >>>>>> happens after 'stubRoutines_init1()'. >>>>> >>>>> >>>>> >>>>> Yes you can, you can do something like we do for narrow_ptrs_base: >>>>> >>>>> if (Universe::is_fully_initialized()) { >>>>> mov(rheapbase, Universe::narrow_ptrs_base()); >>>>> } else { >>>>> lea(rheapbase, >>>>> ExternalAddress((address)Universe::narrow_ptrs_base_addr())); >>>>> ldr(rheapbase, Address(rheapbase)); >>>>> } >>>>> >>>> >>>> Hi Andrew, >>>> >>>> thanks for your suggestion. Yes, I could do that, but that would >>>> replace a constant load in the barrier with a constant load plus a >>>> load from memory, because during stubRoutines_init1() heap won't be >>>> initialized. Not sure about this, but I think we want to avoid this >>>> overhead in the barriers. >>>> >>>> Also, Christian proposed in a previous mail to replace the G1 barrier >>>> stubs on SPARC with simple runtime calls like on other platforms. >>>> While I think that it is probably worthwhile thinking about such a >>>> change, I don't know the exact history of these stubs and probably >>>> some GC experts should decide if that's really a good idea. I'd be >>>> happy to open an extra issue for following up on that path. >>>> >>>> But for the moments I've simply added a new initialization step >>>> "g1_barrier_stubs_init()" between 'univers_init()' and >>>> interpreter_init() which is empty on all platforms except SPARC where >>>> it generates the corresponding stubs: >>>> >>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8166317.v3/ >>>> >>>> I've built and smoke-tested the new change on Windows, MacOS, >>>> Solaris/SPARC, AIX, Linux/x86_64/ppc64/ppc64le/s390. Unfortunately I >>>> don't have access to ARM machines so I couldn't check arm,arm64 and >>>> aarch64 although I don't expect any problems there (actually I've just >>>> added an empty method there). But it would be great if somebody could >>>> check that for any case. >>>> >>>> @Vladimir: I've also rebased the change for "8187091: >>>> ReturnBlobToWrongHeapTest fails because of problems in >>>> CodeHeap::contains_blob()": >>>> >>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v2/ >>>> >>>> Because it changes the same files like 8166317 it should be applied >>>> and pushed only after 8166317 was pushed. >>>> >>>> Thank you and best regards, >>>> Volker >>>> >>>>> -- >>>>> Andrew Haley >>>>> Java Platform Lead Engineer >>>>> Red Hat UK Ltd. 
>>>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From robbin.ehn at oracle.com Tue Oct 24 14:03:37 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 24 Oct 2017 16:03:37 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <33aff570-5bdb-d1aa-bccd-f6122db61051@redhat.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <33aff570-5bdb-d1aa-bccd-f6122db61051@redhat.com> Message-ID: <79f14e0e-6b68-feca-fbc9-3bd538ac7364@oracle.com> On 2017-10-23 18:36, Andrew Haley wrote: > This is a bad way to handle supports_thread_local_poll(): I agree, is this what you had in mind: Incremental: http://cr.openjdk.java.net/~rehn/8185640/v4/Support-Check-Haley-6/webrev/ Thanks, Robbin > > static bool supports_thread_local_poll() { > #if defined(AMD64) || defined(SPARC) > return true; > #else > return false; > #endif > } > > Instead, it is better to use a flag which is #defined in the back > ends, and allow each back end to specify if it supports thread-local > handshakes. We have *two* AARCH64 back ends, and only one of them > supports thread-local handshakes; both of them #define AARCH64. > > #if defined(BLAH) should be reserved for hardware-specific properties, > not back-end-specific properties. > From bob.vandette at oracle.com Tue Oct 24 14:11:43 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 24 Oct 2017 10:11:43 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <2d9dd746-63e1-cade-28f9-5ca1ae1c253e@oracle.com> <200F07CB-35DA-492B-B78D-9EC033EE0431@oracle.com> <833ba1a5-49fc-bb24-ff99-994011af52aa@oracle.com> Message-ID: > On Oct 23, 2017, at 12:52 AM, Kim Barrett wrote: > >> On Sep 27, 2017, at 9:20 PM, David Holmes wrote: >>>> 62 void set_subsystem_path(char *cgroup_path) { >>>> >>>> If this takes a "const char*" will it save you from casting string literals to "char*" elsewhere? >>> I tried several different ways of declaring the container accessor functions and >>> always ended up with warnings due to scanf not being able to validate arguments >>> since the format string didn?t end up being a string literal. I originally was using templates >>> and then ended up with the macros. I tried several different casts but could resolve the problem. >> >> Sounds like something Kim Barrett should take a look at :) > > Fortunately, I just happened by. > > The warnings are because we compile with -Wformat=2, which enables > -Wformat-nonliteral (among other things). > > Use PRAGMA_FORMAT_NONLITERAL_IGNORED, e.g. > > PRAGMA_DIAG_PUSH > PRAGMA_FORMAT_NONLITERAL_IGNORED > > PRAGMA_DIAG_POP > > That will silence warnings about sscanf (or anything else!) with a > non-literal format string within that . Thanks but I ended up taking a different approach that resulted in more compact code. http://cr.openjdk.java.net/~bobv/8146115/webrev.02 > > Also, while I was looking at this, I noticed that in > get_subsytem_file_contents_##return_name, if the sum of the lengths of > get_subsystem_path() and filename is >= MAXBUF, then we can end up > reading from a file other than the one intended, if such a file > exists. That seems like it might be bad. I fixed all uses of strncat. > > Also, the filename argument should be const char*. > Fixed. Thanks, Bob. 
From aph at redhat.com Tue Oct 24 14:21:45 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 24 Oct 2017 15:21:45 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <79f14e0e-6b68-feca-fbc9-3bd538ac7364@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <33aff570-5bdb-d1aa-bccd-f6122db61051@redhat.com> <79f14e0e-6b68-feca-fbc9-3bd538ac7364@oracle.com> Message-ID: On 24/10/17 15:03, Robbin Ehn wrote: > I agree, is this what you had in mind: > Incremental: > http://cr.openjdk.java.net/~rehn/8185640/v4/Support-Check-Haley-6/webrev/ Perfect, thanks. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From robbin.ehn at oracle.com Tue Oct 24 14:54:28 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 24 Oct 2017 16:54:28 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> Message-ID: <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> Hi, I did a fix for the interpreter performance regression, it's plain and simple, I kept the polling code inside dispatch_base but made it optional as the verify oop. Incremental: http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% regression vs TLH off. More insensitive benchmark show no regression. Thanks, Robbin On 2017-10-23 17:58, Karen Kinnear wrote: > Works for me > > Thanks, > Karen > >> On Oct 23, 2017, at 8:40 AM, Doerr, Martin wrote: >> >> Hi Coleen and Robbin, >> >> I'm ok with putting it into a separate RFE. I understand that there are more fun activities than rebasing this XL change for a long time :-) >> So you don't need to delay it. It's acceptable for me. >> >> Thanks, Coleen, for sharing your proposal. I appreciate it. >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >> Sent: Montag, 23. Oktober 2017 17:26 >> To: Doerr, Martin ; hotspot-dev developers >> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >> >> Hi Martin, >> >>> On 2017-10-18 16:05, Doerr, Martin wrote: >>> Hi Robbin, >>> >>> thanks for the quick reply and for doing additional benchmarks. >>> Please note that t->does_dispatch() was just a first idea, but doesn't really fit for the purpose because it's false for conditional branch bytecodes for example. I just didn't find an appropriate quick check in the existing code. >>> I guess you will notice a performance impact when benchmarking with -Xint. (I don't know if Oracle usually runs startup performance benchmarks.) >> >> Yes, we are seeing a performance regression, 2.5%-6% depending on benchmark. >> We are committed to fix this, but it might come as separate RFE/bug depending on >> the JEP's timeline. >> >> (If the fix, very unlikely, would not be done before next release, we would >> change the default to off) >> >> I hope this is an acceptable path? 
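(Aside: a small self-contained model of the shape described above, a poll that is only emitted for the dispatch sites that ask for it, gated the same way the verify-oop check is. This is not interpreter code; the names are placeholders.)

#include <cstdint>

struct ThreadModel {
  volatile uintptr_t polling_word;   // per-thread poll word
  enum { poll_bit = 1 };
};

void take_safepoint_slow_path(ThreadModel* t);   // assumed slow path, defined elsewhere

inline void dispatch_with_optional_poll(ThreadModel* t, bool generate_poll) {
  if (generate_poll && (t->polling_word & (uintptr_t)ThreadModel::poll_bit) != 0) {
    take_safepoint_slow_path(t);     // rare: this thread has been armed
  }
  // ... fall through to the normal bytecode dispatch ...
}
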
>> >> Thanks, Robbin >> >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>> Sent: Mittwoch, 18. Oktober 2017 15:58 >>> To: Doerr, Martin ; hotspot-dev developers >>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>> >>> Hi Martin, >>> >>>> On 2017-10-18 12:11, Doerr, Martin wrote: >>>> Hi Robbin, >>>> >>>> so you would like to push your version first (as it does not break other platforms) and then help us to push non-Oracle platform implementations which change shared code again? >>>> I'd be fine with that, too. >>> >>> Yes, great! >>> >>>> >>>> While thinking a little longer about the interpreter implementation, a new idea came into my mind. >>>> I think we could significantly reduce impact on interpreter code size and performance by using safepoint polls only in a subset of bytecodes. E.g., we could use only bytecodes which perform any kind of jump by implementing something like >>>> if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) generate_safepoint_poll(); >>>> in TemplateInterpreterGenerator::generate_and_dispatch. >>> >>> We have not seen any performance regression in simple benchmark with this. >>> I will do a better benchmark and compare what difference it makes. >>> >>> Thanks, Robbin >>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>> Sent: Mittwoch, 18. Oktober 2017 11:07 >>>> To: Doerr, Martin ; hotspot-dev developers >>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>> >>>> Thanks for looking at this. >>>> >>>>> On 2017-10-17 19:58, Doerr, Martin wrote: >>>>> Hi Robbin, >>>>> >>>>> my first impression is very good. Thanks for providing the webrev. >>>> >>>> Great! >>>> >>>>> >>>>> I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. >>>>> Would it be ok to move the decision between what to use to platform code? >>>>> (Some platforms could still use both if this is beneficial.) >>>>> >>>>> E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. >>>> >>>> I see no issue with this. >>>> Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. >>>> Can we do this incremental when adding the platform support for PPC64? >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn >>>>> Sent: Mittwoch, 11. Oktober 2017 15:38 >>>>> To: hotspot-dev developers >>>>> Subject: RFR(XL): 8185640: Thread-local handshakes >>>>> >>>>> Hi all, >>>>> >>>>> Starting the review of the code while JEP work is still not completed. >>>>> >>>>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >>>>> >>>>> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not >>>>> just all threads or none. 
>>>>> >>>>> Entire changeset: >>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >>>>> >>>>> Divided into 3-parts, >>>>> SafepointMechanism abstraction: >>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >>>>> Consolidating polling page allocation: >>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >>>>> Handshakes: >>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >>>>> >>>>> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread >>>>> itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be >>>>> performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a >>>>> handshake can be performed with that single JavaThread as well. >>>>> >>>>> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the >>>>> guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >>>>> >>>>> Example of potential use-cases: >>>>> -Biased lock revocation >>>>> -External requests for stack traces >>>>> -Deoptimization >>>>> -Async exception delivery >>>>> -External suspension >>>>> -Eliding memory barriers >>>>> >>>>> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >>>>> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported >>>>> platforms are Linux x64 and Solaris SPARC. >>>>> >>>>> Tested heavily with various test suits and comes with a few new tests. >>>>> >>>>> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically >>>>> ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >>>>> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on >>>>> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all >>>>> JavaThreads in an array instead of a linked list. >>>>> >>>>> Thanks, Robbin >>>>> > From martin.doerr at sap.com Tue Oct 24 17:08:25 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 24 Oct 2017 17:08:25 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> Message-ID: <0b34f052cc7047cfb40dc44e91c1300d@sap.com> Hi Robbin, sounds good. Thanks a lot for doing it. 
The change looks good to me except that I'd expect a poll for wide_ret, too. Best regards, Martin -----Original Message----- From: Robbin Ehn [mailto:robbin.ehn at oracle.com] Sent: Dienstag, 24. Oktober 2017 16:54 To: Karen Kinnear ; Doerr, Martin Cc: hotspot-dev developers ; Coleen Phillimore (coleen.phillimore at oracle.com) Subject: Re: RFR(XL): 8185640: Thread-local handshakes Hi, I did a fix for the interpreter performance regression, it's plain and simple, I kept the polling code inside dispatch_base but made it optional as the verify oop. Incremental: http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% regression vs TLH off. More insensitive benchmark show no regression. Thanks, Robbin On 2017-10-23 17:58, Karen Kinnear wrote: > Works for me > > Thanks, > Karen > >> On Oct 23, 2017, at 8:40 AM, Doerr, Martin wrote: >> >> Hi Coleen and Robbin, >> >> I'm ok with putting it into a separate RFE. I understand that there are more fun activities than rebasing this XL change for a long time :-) >> So you don't need to delay it. It's acceptable for me. >> >> Thanks, Coleen, for sharing your proposal. I appreciate it. >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >> Sent: Montag, 23. Oktober 2017 17:26 >> To: Doerr, Martin ; hotspot-dev developers >> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >> >> Hi Martin, >> >>> On 2017-10-18 16:05, Doerr, Martin wrote: >>> Hi Robbin, >>> >>> thanks for the quick reply and for doing additional benchmarks. >>> Please note that t->does_dispatch() was just a first idea, but doesn't really fit for the purpose because it's false for conditional branch bytecodes for example. I just didn't find an appropriate quick check in the existing code. >>> I guess you will notice a performance impact when benchmarking with -Xint. (I don't know if Oracle usually runs startup performance benchmarks.) >> >> Yes, we are seeing a performance regression, 2.5%-6% depending on benchmark. >> We are committed to fix this, but it might come as separate RFE/bug depending on >> the JEP's timeline. >> >> (If the fix, very unlikely, would not be done before next release, we would >> change the default to off) >> >> I hope this is an acceptable path? >> >> Thanks, Robbin >> >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>> Sent: Mittwoch, 18. Oktober 2017 15:58 >>> To: Doerr, Martin ; hotspot-dev developers >>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>> >>> Hi Martin, >>> >>>> On 2017-10-18 12:11, Doerr, Martin wrote: >>>> Hi Robbin, >>>> >>>> so you would like to push your version first (as it does not break other platforms) and then help us to push non-Oracle platform implementations which change shared code again? >>>> I'd be fine with that, too. >>> >>> Yes, great! >>> >>>> >>>> While thinking a little longer about the interpreter implementation, a new idea came into my mind. >>>> I think we could significantly reduce impact on interpreter code size and performance by using safepoint polls only in a subset of bytecodes. 
E.g., we could use only bytecodes which perform any kind of jump by implementing something like >>>> if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) generate_safepoint_poll(); >>>> in TemplateInterpreterGenerator::generate_and_dispatch. >>> >>> We have not seen any performance regression in simple benchmark with this. >>> I will do a better benchmark and compare what difference it makes. >>> >>> Thanks, Robbin >>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>> Sent: Mittwoch, 18. Oktober 2017 11:07 >>>> To: Doerr, Martin ; hotspot-dev developers >>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>> >>>> Thanks for looking at this. >>>> >>>>> On 2017-10-17 19:58, Doerr, Martin wrote: >>>>> Hi Robbin, >>>>> >>>>> my first impression is very good. Thanks for providing the webrev. >>>> >>>> Great! >>>> >>>>> >>>>> I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. >>>>> Would it be ok to move the decision between what to use to platform code? >>>>> (Some platforms could still use both if this is beneficial.) >>>>> >>>>> E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. >>>> >>>> I see no issue with this. >>>> Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. >>>> Can we do this incremental when adding the platform support for PPC64? >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn >>>>> Sent: Mittwoch, 11. Oktober 2017 15:38 >>>>> To: hotspot-dev developers >>>>> Subject: RFR(XL): 8185640: Thread-local handshakes >>>>> >>>>> Hi all, >>>>> >>>>> Starting the review of the code while JEP work is still not completed. >>>>> >>>>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >>>>> >>>>> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not >>>>> just all threads or none. >>>>> >>>>> Entire changeset: >>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >>>>> >>>>> Divided into 3-parts, >>>>> SafepointMechanism abstraction: >>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >>>>> Consolidating polling page allocation: >>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >>>>> Handshakes: >>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >>>>> >>>>> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread >>>>> itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be >>>>> performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a >>>>> handshake can be performed with that single JavaThread as well. 
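In code, the single-JavaThread case just described amounts to handing a closure to the handshake machinery and letting it run while the target is safepoint safe. The sketch below is conceptual: ThreadClosure is an existing HotSpot type, but the exact Handshake::execute() signature is an assumption based on the description above, not necessarily the API in the webrev:

// Conceptual sketch only; the exact handshake API in the webrev may differ.
class StackTraceClosure : public ThreadClosure {
 public:
  void do_thread(Thread* thread) {
    // Runs while 'thread' is in a safepoint-safe state, executed either by
    // the thread itself or by the VM thread with the target kept blocked.
  }
};

void handshake_one(JavaThread* target) {
  StackTraceClosure cl;
  // Single-thread handshake.  On platforms without the per-JavaThread poll
  // this falls back to a normal safepoint (HandshakeOneThread).
  Handshake::execute(&cl, target);
}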
>>>>> >>>>> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the >>>>> guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >>>>> >>>>> Example of potential use-cases: >>>>> -Biased lock revocation >>>>> -External requests for stack traces >>>>> -Deoptimization >>>>> -Async exception delivery >>>>> -External suspension >>>>> -Eliding memory barriers >>>>> >>>>> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >>>>> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported >>>>> platforms are Linux x64 and Solaris SPARC. >>>>> >>>>> Tested heavily with various test suits and comes with a few new tests. >>>>> >>>>> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically >>>>> ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >>>>> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on >>>>> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all >>>>> JavaThreads in an array instead of a linked list. >>>>> >>>>> Thanks, Robbin >>>>> > From kim.barrett at oracle.com Wed Oct 25 06:57:14 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 25 Oct 2017 02:57:14 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <2d9dd746-63e1-cade-28f9-5ca1ae1c253e@oracle.com> <200F07CB-35DA-492B-B78D-9EC033EE0431@oracle.com> <833ba1a5-49fc-bb24-ff99-994011af52aa@oracle.com> Message-ID: > On Oct 24, 2017, at 10:11 AM, Bob Vandette wrote: > > >> On Oct 23, 2017, at 12:52 AM, Kim Barrett wrote: >> >>> On Sep 27, 2017, at 9:20 PM, David Holmes wrote: >>>>> 62 void set_subsystem_path(char *cgroup_path) { >>>>> >>>>> If this takes a "const char*" will it save you from casting string literals to "char*" elsewhere? >>>> I tried several different ways of declaring the container accessor functions and >>>> always ended up with warnings due to scanf not being able to validate arguments >>>> since the format string didn?t end up being a string literal. I originally was using templates >>>> and then ended up with the macros. I tried several different casts but could resolve the problem. >>> >>> Sounds like something Kim Barrett should take a look at :) >> >> Fortunately, I just happened by. >> >> The warnings are because we compile with -Wformat=2, which enables >> -Wformat-nonliteral (among other things). >> >> Use PRAGMA_FORMAT_NONLITERAL_IGNORED, e.g. >> >> PRAGMA_DIAG_PUSH >> PRAGMA_FORMAT_NONLITERAL_IGNORED >> >> PRAGMA_DIAG_POP >> >> That will silence warnings about sscanf (or anything else!) with a >> non-literal format string within that . > > Thanks but I ended up taking a different approach that resulted in more compact code. 
> > http://cr.openjdk.java.net/~bobv/8146115/webrev.02 Not a review, just a few more comments in passing. ------------------------------------------------------------------------------ src/hotspot/os/linux/osContainer_linux.cpp 150 log_debug(os, container)("Type %s not found in file %s\n", \ 151 scan_fmt , buf); \ uses buf as path, but buf has been clobbered to contain contents from file. Similarly for 155 log_debug(os, container)("Empty file %s\n", buf); \ ------------------------------------------------------------------------------ src/hotspot/os/linux/osContainer_linux.cpp 158 log_debug(os, container)("file not found %s\n", buf); \ There are many reasons why fopen might fail, and merging them all into a "file not found" message could be quite confusing. It would be much better to report the error from errno. ------------------------------------------------------------------------------ src/hotspot/os/linux/osContainer_linux.cpp Something like the following (where the obvious helpers are made up to keep this short) would eliminate the macrology. PRAGMA_DIAG_PUSH PRAGMA_FORMAT_NONLITERAL_IGNORED template int get_subsystem_file_contents_value(CgroupSubsystem* c, const char* filename, T* returnval, const char* scan_fmt, const char* description) { const char* line = get_subsystem_file_line(c, filename); if (line != NULL) { if (sscanf(line, scan_fmt, returnval) == 1) { return 0; } else { report_subsystem_file_contents_parse_error(description, c, filename); } } return OSCONTAINER_ERROR; } PRAGMA_DIAG_POP int subsystem_file_contents_int(CgroupSubsystem* c, const char* filename, int* returnval) { return get_subsystem_file_contents_value(c, filename, returnval, "%d", "int"); } ------------------------------------------------------------------------------ From aph at redhat.com Wed Oct 25 10:32:33 2017 From: aph at redhat.com (Andrew Haley) Date: Wed, 25 Oct 2017 11:32:33 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: Do we know hat this is always correct for C2? Could we not have something like ldr r0, [rthread, #polling_page_offset] loop: ldr rscratch, [r0] {poll} cmp foo, bar bne loop when C2 hoists the load of the polling page address out of a loop? Or is such hoisting disable for this case? -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From robbin.ehn at oracle.com Wed Oct 25 11:36:33 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 25 Oct 2017 13:36:33 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: Hi Andrew, The address of the polling page address is static per thread. The load of the polling page address is a dependent load. If the add of the offset to rthread is done outside loop, that is perfectly fine. I do not see an issue here. If I did not understand you correctly, please let me know. Thanks, Robbin On 2017-10-25 12:32, Andrew Haley wrote: > Do we know hat this is always correct for C2? Could we not have > something like > > ldr r0, [rthread, #polling_page_offset] > > loop: > ldr rscratch, [r0] {poll} > cmp foo, bar > bne loop > > when C2 hoists the load of the polling page address out of a loop? > > Or is such hoisting disable for this case? 
> From erik.osterlund at oracle.com Wed Oct 25 11:45:46 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 25 Oct 2017 13:45:46 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> Message-ID: <59F0796A.9060408@oracle.com> Hi Robbin and Andrew, @Robbin: I think Andrew is concerned about a poll inside of the loop always being on if the initial load on rthread points to the trapping page and was loaded into a register (before a loop) that is not changed inside of the loop, and as a consequence gets stuck in trapping all the time for every poll in the loop. By making the initial load from rthread of "raw" pointer type, this load will (as far as I know) not be moved outside of the loop. If it ever was, it would be a bug. Thanks, /Erik On 2017-10-25 13:36, Robbin Ehn wrote: > Hi Andrew, > > The address of the polling page address is static per thread. > The load of the polling page address is a dependent load. > > If the add of the offset to rthread is done outside loop, that is > perfectly fine. I do not see an issue here. If I did not understand > you correctly, please let me know. > > Thanks, Robbin > > On 2017-10-25 12:32, Andrew Haley wrote: >> Do we know hat this is always correct for C2? Could we not have >> something like >> >> ldr r0, [rthread, #polling_page_offset] >> >> loop: >> ldr rscratch, [r0] {poll} >> cmp foo, bar >> bne loop >> >> when C2 hoists the load of the polling page address out of a loop? >> >> Or is such hoisting disable for this case? >> From aph at redhat.com Wed Oct 25 11:52:03 2017 From: aph at redhat.com (Andrew Haley) Date: Wed, 25 Oct 2017 12:52:03 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <59F0796A.9060408@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <59F0796A.9060408@oracle.com> Message-ID: On 25/10/17 12:45, Erik ?sterlund wrote: > By making the initial load from rthread of "raw" pointer type, this load > will (as far as I know) not be moved outside of the loop. If it ever > was, it would be a bug. OK, that's the answer to my question. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From robbin.ehn at oracle.com Wed Oct 25 12:28:50 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 25 Oct 2017 14:28:50 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <59F0796A.9060408@oracle.com> Message-ID: Thanks Erik for understanding and answering! On 2017-10-25 13:52, Andrew Haley wrote: > On 25/10/17 12:45, Erik ?sterlund wrote: >> By making the initial load from rthread of "raw" pointer type, this load >> will (as far as I know) not be moved outside of the loop. If it ever >> was, it would be a bug. > > OK, that's the answer to my question. Great! 
/Robbin > From robbin.ehn at oracle.com Wed Oct 25 12:53:38 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 25 Oct 2017 14:53:38 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <0b34f052cc7047cfb40dc44e91c1300d@sap.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <0b34f052cc7047cfb40dc44e91c1300d@sap.com> Message-ID: Hi Martin, On 2017-10-24 19:08, Doerr, Martin wrote: > Hi Robbin, > > sounds good. Thanks a lot for doing it. > The change looks good to me except that I'd expect a poll for wide_ret, too. Yes, incremental: http://cr.openjdk.java.net/~rehn/8185640/v6/Interpreter-Poll-Wide_Ret-8/webrev/index.html Sanity tested, running big test job now. Thanks! /Robbin > > Best regards, > Martin > > > -----Original Message----- > From: Robbin Ehn [mailto:robbin.ehn at oracle.com] > Sent: Dienstag, 24. Oktober 2017 16:54 > To: Karen Kinnear ; Doerr, Martin > Cc: hotspot-dev developers ; Coleen Phillimore (coleen.phillimore at oracle.com) > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > Hi, > > I did a fix for the interpreter performance regression, it's plain and simple, I > kept the polling code inside dispatch_base but made it optional as the verify oop. > > Incremental: > http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html > > Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake > > It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% regression > vs TLH off. More insensitive benchmark show no regression. > > Thanks, Robbin > > On 2017-10-23 17:58, Karen Kinnear wrote: >> Works for me >> >> Thanks, >> Karen >> >>> On Oct 23, 2017, at 8:40 AM, Doerr, Martin wrote: >>> >>> Hi Coleen and Robbin, >>> >>> I'm ok with putting it into a separate RFE. I understand that there are more fun activities than rebasing this XL change for a long time :-) >>> So you don't need to delay it. It's acceptable for me. >>> >>> Thanks, Coleen, for sharing your proposal. I appreciate it. >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>> Sent: Montag, 23. Oktober 2017 17:26 >>> To: Doerr, Martin ; hotspot-dev developers >>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>> >>> Hi Martin, >>> >>>> On 2017-10-18 16:05, Doerr, Martin wrote: >>>> Hi Robbin, >>>> >>>> thanks for the quick reply and for doing additional benchmarks. >>>> Please note that t->does_dispatch() was just a first idea, but doesn't really fit for the purpose because it's false for conditional branch bytecodes for example. I just didn't find an appropriate quick check in the existing code. >>>> I guess you will notice a performance impact when benchmarking with -Xint. (I don't know if Oracle usually runs startup performance benchmarks.) >>> >>> Yes, we are seeing a performance regression, 2.5%-6% depending on benchmark. >>> We are committed to fix this, but it might come as separate RFE/bug depending on >>> the JEP's timeline. >>> >>> (If the fix, very unlikely, would not be done before next release, we would >>> change the default to off) >>> >>> I hope this is an acceptable path? 
>>> >>> Thanks, Robbin >>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>> Sent: Mittwoch, 18. Oktober 2017 15:58 >>>> To: Doerr, Martin ; hotspot-dev developers >>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>> >>>> Hi Martin, >>>> >>>>> On 2017-10-18 12:11, Doerr, Martin wrote: >>>>> Hi Robbin, >>>>> >>>>> so you would like to push your version first (as it does not break other platforms) and then help us to push non-Oracle platform implementations which change shared code again? >>>>> I'd be fine with that, too. >>>> >>>> Yes, great! >>>> >>>>> >>>>> While thinking a little longer about the interpreter implementation, a new idea came into my mind. >>>>> I think we could significantly reduce impact on interpreter code size and performance by using safepoint polls only in a subset of bytecodes. E.g., we could use only bytecodes which perform any kind of jump by implementing something like >>>>> if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) generate_safepoint_poll(); >>>>> in TemplateInterpreterGenerator::generate_and_dispatch. >>>> >>>> We have not seen any performance regression in simple benchmark with this. >>>> I will do a better benchmark and compare what difference it makes. >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>>> Sent: Mittwoch, 18. Oktober 2017 11:07 >>>>> To: Doerr, Martin ; hotspot-dev developers >>>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>>> >>>>> Thanks for looking at this. >>>>> >>>>>> On 2017-10-17 19:58, Doerr, Martin wrote: >>>>>> Hi Robbin, >>>>>> >>>>>> my first impression is very good. Thanks for providing the webrev. >>>>> >>>>> Great! >>>>> >>>>>> >>>>>> I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. >>>>>> Would it be ok to move the decision between what to use to platform code? >>>>>> (Some platforms could still use both if this is beneficial.) >>>>>> >>>>>> E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. >>>>> >>>>> I see no issue with this. >>>>> Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. >>>>> Can we do this incremental when adding the platform support for PPC64? >>>>> >>>>> Thanks, Robbin >>>>> >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn >>>>>> Sent: Mittwoch, 11. Oktober 2017 15:38 >>>>>> To: hotspot-dev developers >>>>>> Subject: RFR(XL): 8185640: Thread-local handshakes >>>>>> >>>>>> Hi all, >>>>>> >>>>>> Starting the review of the code while JEP work is still not completed. >>>>>> >>>>>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >>>>>> >>>>>> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not >>>>>> just all threads or none. 
>>>>>> >>>>>> Entire changeset: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >>>>>> >>>>>> Divided into 3-parts, >>>>>> SafepointMechanism abstraction: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >>>>>> Consolidating polling page allocation: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >>>>>> Handshakes: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >>>>>> >>>>>> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread >>>>>> itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be >>>>>> performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a >>>>>> handshake can be performed with that single JavaThread as well. >>>>>> >>>>>> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the >>>>>> guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >>>>>> >>>>>> Example of potential use-cases: >>>>>> -Biased lock revocation >>>>>> -External requests for stack traces >>>>>> -Deoptimization >>>>>> -Async exception delivery >>>>>> -External suspension >>>>>> -Eliding memory barriers >>>>>> >>>>>> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >>>>>> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported >>>>>> platforms are Linux x64 and Solaris SPARC. >>>>>> >>>>>> Tested heavily with various test suits and comes with a few new tests. >>>>>> >>>>>> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically >>>>>> ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >>>>>> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on >>>>>> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all >>>>>> JavaThreads in an array instead of a linked list. >>>>>> >>>>>> Thanks, Robbin >>>>>> >> From martin.doerr at sap.com Wed Oct 25 13:14:32 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 25 Oct 2017 13:14:32 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <0b34f052cc7047cfb40dc44e91c1300d@sap.com> Message-ID: <1b4cf2fdb4864377b41dc56016af819f@sap.com> Hi Robbin, thanks a lot. Looks good. 
Best regards, Martin -----Original Message----- From: Robbin Ehn [mailto:robbin.ehn at oracle.com] Sent: Mittwoch, 25. Oktober 2017 14:54 To: Doerr, Martin ; Karen Kinnear Cc: hotspot-dev developers ; Coleen Phillimore (coleen.phillimore at oracle.com) Subject: Re: RFR(XL): 8185640: Thread-local handshakes Hi Martin, On 2017-10-24 19:08, Doerr, Martin wrote: > Hi Robbin, > > sounds good. Thanks a lot for doing it. > The change looks good to me except that I'd expect a poll for wide_ret, too. Yes, incremental: http://cr.openjdk.java.net/~rehn/8185640/v6/Interpreter-Poll-Wide_Ret-8/webrev/index.html Sanity tested, running big test job now. Thanks! /Robbin > > Best regards, > Martin > > > -----Original Message----- > From: Robbin Ehn [mailto:robbin.ehn at oracle.com] > Sent: Dienstag, 24. Oktober 2017 16:54 > To: Karen Kinnear ; Doerr, Martin > Cc: hotspot-dev developers ; Coleen Phillimore (coleen.phillimore at oracle.com) > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > Hi, > > I did a fix for the interpreter performance regression, it's plain and simple, I > kept the polling code inside dispatch_base but made it optional as the verify oop. > > Incremental: > http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html > > Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake > > It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% regression > vs TLH off. More insensitive benchmark show no regression. > > Thanks, Robbin > > On 2017-10-23 17:58, Karen Kinnear wrote: >> Works for me >> >> Thanks, >> Karen >> >>> On Oct 23, 2017, at 8:40 AM, Doerr, Martin wrote: >>> >>> Hi Coleen and Robbin, >>> >>> I'm ok with putting it into a separate RFE. I understand that there are more fun activities than rebasing this XL change for a long time :-) >>> So you don't need to delay it. It's acceptable for me. >>> >>> Thanks, Coleen, for sharing your proposal. I appreciate it. >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>> Sent: Montag, 23. Oktober 2017 17:26 >>> To: Doerr, Martin ; hotspot-dev developers >>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>> >>> Hi Martin, >>> >>>> On 2017-10-18 16:05, Doerr, Martin wrote: >>>> Hi Robbin, >>>> >>>> thanks for the quick reply and for doing additional benchmarks. >>>> Please note that t->does_dispatch() was just a first idea, but doesn't really fit for the purpose because it's false for conditional branch bytecodes for example. I just didn't find an appropriate quick check in the existing code. >>>> I guess you will notice a performance impact when benchmarking with -Xint. (I don't know if Oracle usually runs startup performance benchmarks.) >>> >>> Yes, we are seeing a performance regression, 2.5%-6% depending on benchmark. >>> We are committed to fix this, but it might come as separate RFE/bug depending on >>> the JEP's timeline. >>> >>> (If the fix, very unlikely, would not be done before next release, we would >>> change the default to off) >>> >>> I hope this is an acceptable path? >>> >>> Thanks, Robbin >>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>> Sent: Mittwoch, 18. 
Oktober 2017 15:58 >>>> To: Doerr, Martin ; hotspot-dev developers >>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>> >>>> Hi Martin, >>>> >>>>> On 2017-10-18 12:11, Doerr, Martin wrote: >>>>> Hi Robbin, >>>>> >>>>> so you would like to push your version first (as it does not break other platforms) and then help us to push non-Oracle platform implementations which change shared code again? >>>>> I'd be fine with that, too. >>>> >>>> Yes, great! >>>> >>>>> >>>>> While thinking a little longer about the interpreter implementation, a new idea came into my mind. >>>>> I think we could significantly reduce impact on interpreter code size and performance by using safepoint polls only in a subset of bytecodes. E.g., we could use only bytecodes which perform any kind of jump by implementing something like >>>>> if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) generate_safepoint_poll(); >>>>> in TemplateInterpreterGenerator::generate_and_dispatch. >>>> >>>> We have not seen any performance regression in simple benchmark with this. >>>> I will do a better benchmark and compare what difference it makes. >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>>> Sent: Mittwoch, 18. Oktober 2017 11:07 >>>>> To: Doerr, Martin ; hotspot-dev developers >>>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>>> >>>>> Thanks for looking at this. >>>>> >>>>>> On 2017-10-17 19:58, Doerr, Martin wrote: >>>>>> Hi Robbin, >>>>>> >>>>>> my first impression is very good. Thanks for providing the webrev. >>>>> >>>>> Great! >>>>> >>>>>> >>>>>> I only don't like that "poll_page_val | poll_bit()" is used in shared code. I'd prefer to use either one or the other mechanism. >>>>>> Would it be ok to move the decision between what to use to platform code? >>>>>> (Some platforms could still use both if this is beneficial.) >>>>>> >>>>>> E.g. on PPC64, we'd like to use conditional trap instructions with special bit patterns if UseSIGTRAP is on. Would be excellent if we could implement set functions for _poll_armed_value and _poll_disarmed_value in platform code. poll_bit() also fits better into platform code in my opinion. >>>>> >>>>> I see no issue with this. >>>>> Maybe SafepointMechanism::local_poll_armed should be possibly platform specific. >>>>> Can we do this incremental when adding the platform support for PPC64? >>>>> >>>>> Thanks, Robbin >>>>> >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Robbin Ehn >>>>>> Sent: Mittwoch, 11. Oktober 2017 15:38 >>>>>> To: hotspot-dev developers >>>>>> Subject: RFR(XL): 8185640: Thread-local handshakes >>>>>> >>>>>> Hi all, >>>>>> >>>>>> Starting the review of the code while JEP work is still not completed. >>>>>> >>>>>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >>>>>> >>>>>> This JEP introduces a way to execute a callback on threads without performing a global VM safepoint. It makes it both possible and cheap to stop individual threads and not >>>>>> just all threads or none. 
>>>>>> >>>>>> Entire changeset: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >>>>>> >>>>>> Divided into 3-parts, >>>>>> SafepointMechanism abstraction: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >>>>>> Consolidating polling page allocation: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >>>>>> Handshakes: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >>>>>> >>>>>> A handshake operation is a callback that is executed for each JavaThread while that thread is in a safepoint safe state. The callback is executed either by the thread >>>>>> itself or by the VM thread while keeping the thread in a blocked state. The big difference between safepointing and handshaking is that the per thread operation will be >>>>>> performed on all threads as soon as possible and they will continue to execute as soon as it?s own operation is completed. If a JavaThread is known to be running, then a >>>>>> handshake can be performed with that single JavaThread as well. >>>>>> >>>>>> The current safepointing scheme is modified to perform an indirection through a per-thread pointer which will allow a single thread's execution to be forced to trap on the >>>>>> guard page. In order to force a thread to yield the VM updates the per-thread pointer for the corresponding thread to point to the guarded page. >>>>>> >>>>>> Example of potential use-cases: >>>>>> -Biased lock revocation >>>>>> -External requests for stack traces >>>>>> -Deoptimization >>>>>> -Async exception delivery >>>>>> -External suspension >>>>>> -Eliding memory barriers >>>>>> >>>>>> All of these will benefit the VM moving towards becoming more low-latency friendly by reducing the number of global safepoints. >>>>>> Platforms that do not yet implement the per JavaThread poll, a fallback to normal safepoint is in place. HandshakeOneThread will then be a normal safepoint. The supported >>>>>> platforms are Linux x64 and Solaris SPARC. >>>>>> >>>>>> Tested heavily with various test suits and comes with a few new tests. >>>>>> >>>>>> Performance testing using standardized benchmark show no signification changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC (not statistically >>>>>> ensured). A minor regression for the load vs load load on x64 is expected and a slight increase on SPARC due to the cost of ?materializing? the page vs load load. >>>>>> The time to trigger a safepoint was measured on a large machine to not be an issue. The looping over threads and arming the polling page will benefit from the work on >>>>>> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) which puts all >>>>>> JavaThreads in an array instead of a linked list. 
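The per-thread indirection described above can be pictured with a small sketch. All names and helpers below are illustrative assumptions, not the actual SafepointMechanism code from the webrev:

// Illustrative sketch of the per-thread poll; names and helpers are assumed.
static uintptr_t* guarded_page();   // protected page: any load through it traps
static uintptr_t* readable_page();  // ordinary page: loads are harmless

class PerThreadPoll {
  // The per-JavaThread pointer that generated code polls through.
  volatile uintptr_t* _polling_page;
 public:
  void arm()    { _polling_page = guarded_page();  }  // VM forces this thread to yield
  void disarm() { _polling_page = readable_page(); }  // poll becomes a harmless load
  // What an emitted poll conceptually does: a dependent load through the
  // per-thread pointer, so only armed threads take the trap.
  void poll() const { (void)*_polling_page; }
};

Arming is just a store of the guarded address into one thread's pointer, which is why a single thread can be stopped without disturbing the others.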
>>>>>> >>>>>> Thanks, Robbin >>>>>> >> From coleen.phillimore at oracle.com Wed Oct 25 13:19:30 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 25 Oct 2017 09:19:30 -0400 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> Message-ID: <3b7f8cb6-527d-b5f6-25d4-16bbae9d7ae2@oracle.com> Hi Robbin, This change (with the addition of the poll at wide_ret) looks good. It came out nicely in the code. thanks, Coleen On 10/24/17 10:54 AM, Robbin Ehn wrote: > Hi, > > I did a fix for the interpreter performance regression, it's plain and > simple, I kept the polling code inside dispatch_base but made it > optional as the verify oop. > > Incremental: > http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html > > > Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake > > It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% > regression vs TLH off. More insensitive benchmark show no regression. > > Thanks, Robbin > > On 2017-10-23 17:58, Karen Kinnear wrote: >> Works for me >> >> Thanks, >> Karen >> >>> On Oct 23, 2017, at 8:40 AM, Doerr, Martin >>> wrote: >>> >>> Hi Coleen and Robbin, >>> >>> I'm ok with putting it into a separate RFE. I understand that there >>> are more fun activities than rebasing this XL change for a long time >>> :-) >>> So you don't need to delay it. It's acceptable for me. >>> >>> Thanks, Coleen, for sharing your proposal. I appreciate it. >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>> Sent: Montag, 23. Oktober 2017 17:26 >>> To: Doerr, Martin ; hotspot-dev developers >>> >>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>> >>> Hi Martin, >>> >>>> On 2017-10-18 16:05, Doerr, Martin wrote: >>>> Hi Robbin, >>>> >>>> thanks for the quick reply and for doing additional benchmarks. >>>> Please note that t->does_dispatch() was just a first idea, but >>>> doesn't really fit for the purpose because it's false for >>>> conditional branch bytecodes for example. I just didn't find an >>>> appropriate quick check in the existing code. >>>> I guess you will notice a performance impact when benchmarking with >>>> -Xint. (I don't know if Oracle usually runs startup performance >>>> benchmarks.) >>> >>> Yes, we are seeing a performance regression, 2.5%-6% depending on >>> benchmark. >>> We are committed to fix this, but it might come as separate RFE/bug >>> depending on >>> the JEP's timeline. >>> >>> (If the fix, very unlikely, would not be done before next release, >>> we would >>> change the default to off) >>> >>> I hope this is an acceptable path? >>> >>> Thanks, Robbin >>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>> Sent: Mittwoch, 18. 
Oktober 2017 15:58 >>>> To: Doerr, Martin ; hotspot-dev developers >>>> >>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>> >>>> Hi Martin, >>>> >>>>> On 2017-10-18 12:11, Doerr, Martin wrote: >>>>> Hi Robbin, >>>>> >>>>> so you would like to push your version first (as it does not break >>>>> other platforms) and then help us to push non-Oracle platform >>>>> implementations which change shared code again? >>>>> I'd be fine with that, too. >>>> >>>> Yes, great! >>>> >>>>> >>>>> While thinking a little longer about the interpreter >>>>> implementation, a new idea came into my mind. >>>>> I think we could significantly reduce impact on interpreter code >>>>> size and performance by using safepoint polls only in a subset of >>>>> bytecodes. E.g., we could use only bytecodes which perform any >>>>> kind of jump by implementing something like >>>>> if (SafepointMechanism::uses_thread_local_poll() && >>>>> t->does_dispatch()) generate_safepoint_poll(); >>>>> in TemplateInterpreterGenerator::generate_and_dispatch. >>>> >>>> We have not seen any performance regression in simple benchmark >>>> with this. >>>> I will do a better benchmark and compare what difference it makes. >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>>> Sent: Mittwoch, 18. Oktober 2017 11:07 >>>>> To: Doerr, Martin ; hotspot-dev developers >>>>> >>>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>>> >>>>> Thanks for looking at this. >>>>> >>>>>> On 2017-10-17 19:58, Doerr, Martin wrote: >>>>>> Hi Robbin, >>>>>> >>>>>> my first impression is very good. Thanks for providing the webrev. >>>>> >>>>> Great! >>>>> >>>>>> >>>>>> I only don't like that "poll_page_val | poll_bit()" is used in >>>>>> shared code. I'd prefer to use either one or the other mechanism. >>>>>> Would it be ok to move the decision between what to use to >>>>>> platform code? >>>>>> (Some platforms could still use both if this is beneficial.) >>>>>> >>>>>> E.g. on PPC64, we'd like to use conditional trap instructions >>>>>> with special bit patterns if UseSIGTRAP is on. Would be excellent >>>>>> if we could implement set functions for _poll_armed_value and >>>>>> _poll_disarmed_value in platform code. poll_bit() also fits >>>>>> better into platform code in my opinion. >>>>> >>>>> I see no issue with this. >>>>> Maybe SafepointMechanism::local_poll_armed should be possibly >>>>> platform specific. >>>>> Can we do this incremental when adding the platform support for >>>>> PPC64? >>>>> >>>>> Thanks, Robbin >>>>> >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] >>>>>> On Behalf Of Robbin Ehn >>>>>> Sent: Mittwoch, 11. Oktober 2017 15:38 >>>>>> To: hotspot-dev developers >>>>>> Subject: RFR(XL): 8185640: Thread-local handshakes >>>>>> >>>>>> Hi all, >>>>>> >>>>>> Starting the review of the code while JEP work is still not >>>>>> completed. >>>>>> >>>>>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >>>>>> >>>>>> This JEP introduces a way to execute a callback on threads >>>>>> without performing a global VM safepoint. It makes it both >>>>>> possible and cheap to stop individual threads and not >>>>>> just all threads or none. 
>>>>>> >>>>>> Entire changeset: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >>>>>> >>>>>> Divided into 3-parts, >>>>>> SafepointMechanism abstraction: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >>>>>> Consolidating polling page allocation: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >>>>>> Handshakes: >>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >>>>>> >>>>>> A handshake operation is a callback that is executed for each >>>>>> JavaThread while that thread is in a safepoint safe state. The >>>>>> callback is executed either by the thread >>>>>> itself or by the VM thread while keeping the thread in a blocked >>>>>> state. The big difference between safepointing and handshaking is >>>>>> that the per thread operation will be >>>>>> performed on all threads as soon as possible and they will >>>>>> continue to execute as soon as it?s own operation is completed. >>>>>> If a JavaThread is known to be running, then a >>>>>> handshake can be performed with that single JavaThread as well. >>>>>> >>>>>> The current safepointing scheme is modified to perform an >>>>>> indirection through a per-thread pointer which will allow a >>>>>> single thread's execution to be forced to trap on the >>>>>> guard page. In order to force a thread to yield the VM updates >>>>>> the per-thread pointer for the corresponding thread to point to >>>>>> the guarded page. >>>>>> >>>>>> Example of potential use-cases: >>>>>> -Biased lock revocation >>>>>> -External requests for stack traces >>>>>> -Deoptimization >>>>>> -Async exception delivery >>>>>> -External suspension >>>>>> -Eliding memory barriers >>>>>> >>>>>> All of these will benefit the VM moving towards becoming more >>>>>> low-latency friendly by reducing the number of global safepoints. >>>>>> Platforms that do not yet implement the per JavaThread poll, a >>>>>> fallback to normal safepoint is in place. HandshakeOneThread will >>>>>> then be a normal safepoint. The supported >>>>>> platforms are Linux x64 and Solaris SPARC. >>>>>> >>>>>> Tested heavily with various test suits and comes with a few new >>>>>> tests. >>>>>> >>>>>> Performance testing using standardized benchmark show no >>>>>> signification changes, the latest number was -0.7% on Linux x64 >>>>>> and +1.5% Solaris SPARC (not statistically >>>>>> ensured). A minor regression for the load vs load load on x64 is >>>>>> expected and a slight increase on SPARC due to the cost of >>>>>> ?materializing? the page vs load load. >>>>>> The time to trigger a safepoint was measured on a large machine >>>>>> to not be an issue. The looping over threads and arming the >>>>>> polling page will benefit from the work on >>>>>> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) >>>>>> which puts all >>>>>> JavaThreads in an array instead of a linked list. 
>>>>>> >>>>>> Thanks, Robbin >>>>>> >> From robbin.ehn at oracle.com Wed Oct 25 13:35:55 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 25 Oct 2017 15:35:55 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <3b7f8cb6-527d-b5f6-25d4-16bbae9d7ae2@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <3b7f8cb6-527d-b5f6-25d4-16bbae9d7ae2@oracle.com> Message-ID: <03a50ba7-7afd-aca5-d59c-0d7472b513c8@oracle.com> Thanks Coleen, Robbin On 2017-10-25 15:19, coleen.phillimore at oracle.com wrote: > > Hi Robbin, > This change (with the addition of the poll at wide_ret) looks good. It came out > nicely in the code. > thanks, > Coleen > > On 10/24/17 10:54 AM, Robbin Ehn wrote: >> Hi, >> >> I did a fix for the interpreter performance regression, it's plain and simple, >> I kept the polling code inside dispatch_base but made it optional as the >> verify oop. >> >> Incremental: >> http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html >> >> Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake >> >> It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% >> regression vs TLH off. More insensitive benchmark show no regression. >> >> Thanks, Robbin >> >> On 2017-10-23 17:58, Karen Kinnear wrote: >>> Works for me >>> >>> Thanks, >>> Karen >>> >>>> On Oct 23, 2017, at 8:40 AM, Doerr, Martin wrote: >>>> >>>> Hi Coleen and Robbin, >>>> >>>> I'm ok with putting it into a separate RFE. I understand that there are more >>>> fun activities than rebasing this XL change for a long time :-) >>>> So you don't need to delay it. It's acceptable for me. >>>> >>>> Thanks, Coleen, for sharing your proposal. I appreciate it. >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>> Sent: Montag, 23. Oktober 2017 17:26 >>>> To: Doerr, Martin ; hotspot-dev developers >>>> >>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>> >>>> Hi Martin, >>>> >>>>> On 2017-10-18 16:05, Doerr, Martin wrote: >>>>> Hi Robbin, >>>>> >>>>> thanks for the quick reply and for doing additional benchmarks. >>>>> Please note that t->does_dispatch() was just a first idea, but doesn't >>>>> really fit for the purpose because it's false for conditional branch >>>>> bytecodes for example. I just didn't find an appropriate quick check in the >>>>> existing code. >>>>> I guess you will notice a performance impact when benchmarking with -Xint. >>>>> (I don't know if Oracle usually runs startup performance benchmarks.) >>>> >>>> Yes, we are seeing a performance regression, 2.5%-6% depending on benchmark. >>>> We are committed to fix this, but it might come as separate RFE/bug >>>> depending on >>>> the JEP's timeline. >>>> >>>> (If the fix, very unlikely, would not be done before next release, we would >>>> change the default to off) >>>> >>>> I hope this is an acceptable path? >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>>> Sent: Mittwoch, 18. 
Oktober 2017 15:58 >>>>> To: Doerr, Martin ; hotspot-dev developers >>>>> >>>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>>> >>>>> Hi Martin, >>>>> >>>>>> On 2017-10-18 12:11, Doerr, Martin wrote: >>>>>> Hi Robbin, >>>>>> >>>>>> so you would like to push your version first (as it does not break other >>>>>> platforms) and then help us to push non-Oracle platform implementations >>>>>> which change shared code again? >>>>>> I'd be fine with that, too. >>>>> >>>>> Yes, great! >>>>> >>>>>> >>>>>> While thinking a little longer about the interpreter implementation, a new >>>>>> idea came into my mind. >>>>>> I think we could significantly reduce impact on interpreter code size and >>>>>> performance by using safepoint polls only in a subset of bytecodes. E.g., >>>>>> we could use only bytecodes which perform any kind of jump by implementing >>>>>> something like >>>>>> if (SafepointMechanism::uses_thread_local_poll() && t->does_dispatch()) >>>>>> generate_safepoint_poll(); >>>>>> in TemplateInterpreterGenerator::generate_and_dispatch. >>>>> >>>>> We have not seen any performance regression in simple benchmark with this. >>>>> I will do a better benchmark and compare what difference it makes. >>>>> >>>>> Thanks, Robbin >>>>> >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>>>>> Sent: Mittwoch, 18. Oktober 2017 11:07 >>>>>> To: Doerr, Martin ; hotspot-dev developers >>>>>> >>>>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>>>>> >>>>>> Thanks for looking at this. >>>>>> >>>>>>> On 2017-10-17 19:58, Doerr, Martin wrote: >>>>>>> Hi Robbin, >>>>>>> >>>>>>> my first impression is very good. Thanks for providing the webrev. >>>>>> >>>>>> Great! >>>>>> >>>>>>> >>>>>>> I only don't like that "poll_page_val | poll_bit()" is used in shared >>>>>>> code. I'd prefer to use either one or the other mechanism. >>>>>>> Would it be ok to move the decision between what to use to platform code? >>>>>>> (Some platforms could still use both if this is beneficial.) >>>>>>> >>>>>>> E.g. on PPC64, we'd like to use conditional trap instructions with >>>>>>> special bit patterns if UseSIGTRAP is on. Would be excellent if we could >>>>>>> implement set functions for _poll_armed_value and _poll_disarmed_value in >>>>>>> platform code. poll_bit() also fits better into platform code in my opinion. >>>>>> >>>>>> I see no issue with this. >>>>>> Maybe SafepointMechanism::local_poll_armed should be possibly platform >>>>>> specific. >>>>>> Can we do this incremental when adding the platform support for PPC64? >>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>>> >>>>>>> Best regards, >>>>>>> Martin >>>>>>> >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf >>>>>>> Of Robbin Ehn >>>>>>> Sent: Mittwoch, 11. Oktober 2017 15:38 >>>>>>> To: hotspot-dev developers >>>>>>> Subject: RFR(XL): 8185640: Thread-local handshakes >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> Starting the review of the code while JEP work is still not completed. >>>>>>> >>>>>>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640 >>>>>>> >>>>>>> This JEP introduces a way to execute a callback on threads without >>>>>>> performing a global VM safepoint. It makes it both possible and cheap to >>>>>>> stop individual threads and not >>>>>>> just all threads or none. 
>>>>>>> >>>>>>> Entire changeset: >>>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/ >>>>>>> >>>>>>> Divided into 3-parts, >>>>>>> SafepointMechanism abstraction: >>>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/ >>>>>>> Consolidating polling page allocation: >>>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/ >>>>>>> Handshakes: >>>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/ >>>>>>> >>>>>>> A handshake operation is a callback that is executed for each JavaThread >>>>>>> while that thread is in a safepoint safe state. The callback is executed >>>>>>> either by the thread >>>>>>> itself or by the VM thread while keeping the thread in a blocked state. >>>>>>> The big difference between safepointing and handshaking is that the per >>>>>>> thread operation will be >>>>>>> performed on all threads as soon as possible and they will continue to >>>>>>> execute as soon as it?s own operation is completed. If a JavaThread is >>>>>>> known to be running, then a >>>>>>> handshake can be performed with that single JavaThread as well. >>>>>>> >>>>>>> The current safepointing scheme is modified to perform an indirection >>>>>>> through a per-thread pointer which will allow a single thread's execution >>>>>>> to be forced to trap on the >>>>>>> guard page. In order to force a thread to yield the VM updates the >>>>>>> per-thread pointer for the corresponding thread to point to the guarded >>>>>>> page. >>>>>>> >>>>>>> Example of potential use-cases: >>>>>>> -Biased lock revocation >>>>>>> -External requests for stack traces >>>>>>> -Deoptimization >>>>>>> -Async exception delivery >>>>>>> -External suspension >>>>>>> -Eliding memory barriers >>>>>>> >>>>>>> All of these will benefit the VM moving towards becoming more low-latency >>>>>>> friendly by reducing the number of global safepoints. >>>>>>> Platforms that do not yet implement the per JavaThread poll, a fallback >>>>>>> to normal safepoint is in place. HandshakeOneThread will then be a normal >>>>>>> safepoint. The supported >>>>>>> platforms are Linux x64 and Solaris SPARC. >>>>>>> >>>>>>> Tested heavily with various test suits and comes with a few new tests. >>>>>>> >>>>>>> Performance testing using standardized benchmark show no signification >>>>>>> changes, the latest number was -0.7% on Linux x64 and +1.5% Solaris SPARC >>>>>>> (not statistically >>>>>>> ensured). A minor regression for the load vs load load on x64 is expected >>>>>>> and a slight increase on SPARC due to the cost of ?materializing? the >>>>>>> page vs load load. >>>>>>> The time to trigger a safepoint was measured on a large machine to not be >>>>>>> an issue. The looping over threads and arming the polling page will >>>>>>> benefit from the work on >>>>>>> JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle: >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html) >>>>>>> which puts all >>>>>>> JavaThreads in an array instead of a linked list. 
>>>>>>> >>>>>>> Thanks, Robbin >>>>>>> >>> > From aph at redhat.com Wed Oct 25 15:16:40 2017 From: aph at redhat.com (Andrew Haley) Date: Wed, 25 Oct 2017 16:16:40 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> Message-ID: <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> On 24/10/17 15:54, Robbin Ehn wrote: > I did a fix for the interpreter performance regression, it's plain and simple, I > kept the polling code inside dispatch_base but made it optional as the verify oop. > > Incremental: > http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html > > Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake > > It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% regression > vs TLH off. More insensitive benchmark show no regression. I think it's not quite right: you're missing a check in tableswitch and fast_linearswitch. These can be used to construct loops. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From coleen.phillimore at oracle.com Wed Oct 25 16:49:54 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 25 Oct 2017 12:49:54 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot Message-ID: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> Summary: removed hotspot version of jvm*h and jni*h files Mostly used sed to remove prims/jvm.h and move #include "jvm.h" after precompiled.h, so if you have repetitive stress wrist issues don't click on most of these files. There were more issues to resolve, however.? The JDK windows jni_md.h file defined jint as long and the hotspot windows jni_x86.h as int.? I had to choose the jdk version since it's the public version, so there are changes to the hotspot files for this. Generally I changed the code to use 'int' rather than 'jint' where the surrounding API didn't insist on consistently using java types. We should mostly be using C++ types within hotspot except in interfaces to native/JNI code.? There are a couple of hacks in places where adding multiple jint casts was too painful. Tested with JPRT and tier2-4 (in progress). open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8189610 I have a script to update copyright files on commit. Thanks to Magnus and ErikJ for the makefile changes. 
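The jint mismatch mentioned above is easy to reproduce in isolation (condensed illustration only, not the literal headers):

    // Condensed illustration of the conflict, not the literal headers.  On
    // Windows both types are 32 bits wide, but 'long' and 'int' are distinct
    // C++ types, so overload resolution, templates and format checking treat
    // them differently; that is what forces the (int)/(jint) adjustments in
    // the hotspot sources once the public definition is the only one left.
    #ifdef _MSC_VER
    typedef long jint_as_in_public_jni_md_h;   // JDK windows jni_md.h
    typedef int  jint_as_in_old_jni_x86_h;     // old hotspot jni_x86.h
    static_assert(sizeof(jint_as_in_public_jni_md_h) == sizeof(jint_as_in_old_jni_x86_h),
                  "same width, different type");
    #endif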
Thanks, Coleen From martin.doerr at sap.com Wed Oct 25 19:38:02 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 25 Oct 2017 19:38:02 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> Message-ID: <8d7678bf2281406da43cbe090276b51f@sap.com> Hi Andrew, I think you're right. A Java program could have a goto in one of the cases of any switch which gets optimized out (by javac) replacing the branch target of the case. So I think we need safepoint polls in all switch templates, too. Best regards, Martin -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Mittwoch, 25. Oktober 2017 17:17 To: Robbin Ehn ; Karen Kinnear ; Doerr, Martin Cc: hotspot-dev developers Subject: Re: RFR(XL): 8185640: Thread-local handshakes On 24/10/17 15:54, Robbin Ehn wrote: > I did a fix for the interpreter performance regression, it's plain and simple, I > kept the polling code inside dispatch_base but made it optional as the verify oop. > > Incremental: > http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html > > Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake > > It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% regression > vs TLH off. More insensitive benchmark show no regression. I think it's not quite right: you're missing a check in tableswitch and fast_linearswitch. These can be used to construct loops. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From robbin.ehn at oracle.com Wed Oct 25 19:52:33 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 25 Oct 2017 21:52:33 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <8d7678bf2281406da43cbe090276b51f@sap.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> Message-ID: <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> Hi, First thanks both for reviewing this! On 2017-10-25 21:38, Doerr, Martin wrote: > Hi Andrew, > > I think you're right. > > A Java program could have a goto in one of the cases of any switch which gets optimized out (by javac) replacing the branch target of the case. > > So I think we need safepoint polls in all switch templates, too. That's lookupswitch and binaryswitch also? Thanks, Robbin > > Best regards, > Martin > > > -----Original Message----- > From: Andrew Haley [mailto:aph at redhat.com] > Sent: Mittwoch, 25. 
Oktober 2017 17:17 > To: Robbin Ehn ; Karen Kinnear ; Doerr, Martin > Cc: hotspot-dev developers > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > On 24/10/17 15:54, Robbin Ehn wrote: > >> I did a fix for the interpreter performance regression, it's plain and simple, I >> kept the polling code inside dispatch_base but made it optional as the verify oop. >> >> Incremental: >> http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html >> >> Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake >> >> It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% regression >> vs TLH off. More insensitive benchmark show no regression. > > I think it's not quite right: you're missing a check in tableswitch > and fast_linearswitch. These can be used to construct loops. > From martin.doerr at sap.com Wed Oct 25 20:23:49 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 25 Oct 2017 20:23:49 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> Message-ID: <818e352d5e3a450491cf0c140bf129d6@sap.com> Hi Robbin, as far as I can see tableswitch, fast_linearswitch and fast_binaryswitch should get the poll. lookupswitch gets rewritten to fast_linearswitch or fast_binaryswitch. Sorry that my proposal created extra work for you. Thanks for doing it. Best regards, Martin -----Original Message----- From: Robbin Ehn [mailto:robbin.ehn at oracle.com] Sent: Mittwoch, 25. Oktober 2017 21:53 To: Doerr, Martin ; Andrew Haley ; Karen Kinnear Cc: hotspot-dev developers Subject: Re: RFR(XL): 8185640: Thread-local handshakes Hi, First thanks both for reviewing this! On 2017-10-25 21:38, Doerr, Martin wrote: > Hi Andrew, > > I think you're right. > > A Java program could have a goto in one of the cases of any switch which gets optimized out (by javac) replacing the branch target of the case. > > So I think we need safepoint polls in all switch templates, too. That's lookupswitch and binaryswitch also? Thanks, Robbin > > Best regards, > Martin > > > -----Original Message----- > From: Andrew Haley [mailto:aph at redhat.com] > Sent: Mittwoch, 25. Oktober 2017 17:17 > To: Robbin Ehn ; Karen Kinnear ; Doerr, Martin > Cc: hotspot-dev developers > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > On 24/10/17 15:54, Robbin Ehn wrote: > >> I did a fix for the interpreter performance regression, it's plain and simple, I >> kept the polling code inside dispatch_base but made it optional as the verify oop. >> >> Incremental: >> http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html >> >> Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake >> >> It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% regression >> vs TLH off. More insensitive benchmark show no regression. > > I think it's not quite right: you're missing a check in tableswitch > and fast_linearswitch. These can be used to construct loops. 
> From martin.doerr at sap.com Wed Oct 25 22:05:43 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 25 Oct 2017 22:05:43 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <818e352d5e3a450491cf0c140bf129d6@sap.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> Message-ID: <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> Hi, it's me again. after looking at the bytecodes again, I remembered that ret is olny for jsr. I think polling should also be done for the regular returns. A poll at the beginning of TemplateTable::_return should do the job. Unfortunately, it doesn't fit into your dispatch scheme. Best regards, Martin -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Doerr, Martin Sent: Mittwoch, 25. Oktober 2017 22:24 To: Robbin Ehn ; Andrew Haley ; Karen Kinnear Cc: hotspot-dev developers Subject: RE: RFR(XL): 8185640: Thread-local handshakes Hi Robbin, as far as I can see tableswitch, fast_linearswitch and fast_binaryswitch should get the poll. lookupswitch gets rewritten to fast_linearswitch or fast_binaryswitch. Sorry that my proposal created extra work for you. Thanks for doing it. Best regards, Martin -----Original Message----- From: Robbin Ehn [mailto:robbin.ehn at oracle.com] Sent: Mittwoch, 25. Oktober 2017 21:53 To: Doerr, Martin ; Andrew Haley ; Karen Kinnear Cc: hotspot-dev developers Subject: Re: RFR(XL): 8185640: Thread-local handshakes Hi, First thanks both for reviewing this! On 2017-10-25 21:38, Doerr, Martin wrote: > Hi Andrew, > > I think you're right. > > A Java program could have a goto in one of the cases of any switch which gets optimized out (by javac) replacing the branch target of the case. > > So I think we need safepoint polls in all switch templates, too. That's lookupswitch and binaryswitch also? Thanks, Robbin > > Best regards, > Martin > > > -----Original Message----- > From: Andrew Haley [mailto:aph at redhat.com] > Sent: Mittwoch, 25. Oktober 2017 17:17 > To: Robbin Ehn ; Karen Kinnear ; Doerr, Martin > Cc: hotspot-dev developers > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > On 24/10/17 15:54, Robbin Ehn wrote: > >> I did a fix for the interpreter performance regression, it's plain and simple, I >> kept the polling code inside dispatch_base but made it optional as the verify oop. >> >> Incremental: >> http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html >> >> Manual tested with jstack and it passes: hotspot_tier1, hotspot_handshake >> >> It reduces the polling cost of 80%, sensitive benchmarks shows -0.44% regression >> vs TLH off. More insensitive benchmark show no regression. > > I think it's not quite right: you're missing a check in tableswitch > and fast_linearswitch. These can be used to construct loops. 
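Summing up this sub-thread, the extra templates that need a poll under thread-local polling can be written down as a checklist (hypothetical helper, purely illustrative; the real selection happens in the platform template code):

    #include "interpreter/bytecodes.hpp"

    // Hypothetical helper, purely illustrative: in addition to the branch
    // bytecodes the incremental webrev already covers, the discussion above
    // concludes that these also need a thread-local poll, because they can
    // transfer control backwards or leave the method.
    static bool also_needs_poll(Bytecodes::Code code) {
      switch (code) {
        case Bytecodes::_tableswitch:        // switch targets may point backwards,
        case Bytecodes::_lookupswitch:       // so a loop can consist of nothing
        case Bytecodes::_fast_linearswitch:  // but switch dispatches
        case Bytecodes::_fast_binaryswitch:
        case Bytecodes::_ireturn:            // regular returns; _ret only covers jsr
        case Bytecodes::_lreturn:
        case Bytecodes::_freturn:
        case Bytecodes::_dreturn:
        case Bytecodes::_areturn:
        case Bytecodes::_return:
          return true;
        default:
          return false;
      }
    }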
> From aph at redhat.com Thu Oct 26 08:58:31 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 26 Oct 2017 09:58:31 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> Message-ID: <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> On 25/10/17 23:05, Doerr, Martin wrote: > after looking at the bytecodes again, I remembered that ret is olny > for jsr. I think polling should also be done for the regular > returns. > A poll at the beginning of TemplateTable::_return should do the > job. Unfortunately, it doesn't fit into your dispatch scheme. I'm wondering if this is a good idea at all: it could increase the latency of taking a safepoint in bytecode. Granted, it does avoid some significant code bloat in the interpreter. BTW, I don't understand why interpreted code doesn't simply read the polling page. Or we could even simply read-protect the bytecode dispatch tables themselves. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Thu Oct 26 08:59:22 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 26 Oct 2017 09:59:22 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> Message-ID: <2600c955-57b3-f442-f9fa-0e064ea3916f@redhat.com> On 26/10/17 09:58, Andrew Haley wrote: > Or we could even simply read-protect the bytecode > dispatch tables themselves. But not with thread-local handshakes, of course. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Thu Oct 26 09:30:37 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 26 Oct 2017 09:30:37 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> Message-ID: Hi Andrew, I don't think this will increase safepoint latency (if implemented appropriately). Methods compiled by C2 may contain counted loops (with int range) without safepoint. So this may be quite long in comparison to an interpreted method which can only contain up to 64 k bytecodes while every branch contains a safepoint check. (One might be kind of concerned about no poll in calls in the current implementation.) Best regards, Martin -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Donnerstag, 26. Oktober 2017 10:59 To: Doerr, Martin ; Robbin Ehn ; Karen Kinnear ; Coleen Phillimore (coleen.phillimore at oracle.com) Cc: hotspot-dev developers Subject: Re: RFR(XL): 8185640: Thread-local handshakes On 25/10/17 23:05, Doerr, Martin wrote: > after looking at the bytecodes again, I remembered that ret is olny > for jsr. I think polling should also be done for the regular > returns. > A poll at the beginning of TemplateTable::_return should do the > job. Unfortunately, it doesn't fit into your dispatch scheme. I'm wondering if this is a good idea at all: it could increase the latency of taking a safepoint in bytecode. Granted, it does avoid some significant code bloat in the interpreter. BTW, I don't understand why interpreted code doesn't simply read the polling page. Or we could even simply read-protect the bytecode dispatch tables themselves. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Thu Oct 26 09:39:53 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 26 Oct 2017 10:39:53 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> Message-ID: <57ce3fa7-5ba7-2525-3e9f-8b65ee34a24d@redhat.com> On 26/10/17 10:30, Doerr, Martin wrote: > I don't think this will increase safepoint latency (if implemented > appropriately). Methods compiled by C2 may contain counted loops > (with int range) without safepoint. 
So this may be quite long in > comparison to an interpreted method which can only contain up to 64k > bytecodes while every branch contains a safepoint check. This is to say, I think, that we already have one source of severe safepoint delays, so why not have another one? 64k bytecodes is a lot. > (One might be kind of concerned about no poll in calls in the > current implementation.) I'm not sure why. For every call, there is a return. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Thu Oct 26 09:44:01 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 26 Oct 2017 09:44:01 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <57ce3fa7-5ba7-2525-3e9f-8b65ee34a24d@redhat.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <57ce3fa7-5ba7-2525-3e9f-8b65ee34a24d@redhat.com> Message-ID: 64k is the absolute worst case. I guess it won't take long until a branch gets reached in typical bytecode. My point regarding the call is that it may be a tail recursion which fills up the stack. Best regards, Martin -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Donnerstag, 26. Oktober 2017 11:40 To: Doerr, Martin ; Robbin Ehn ; Karen Kinnear ; Coleen Phillimore (coleen.phillimore at oracle.com) Cc: hotspot-dev developers Subject: Re: RFR(XL): 8185640: Thread-local handshakes On 26/10/17 10:30, Doerr, Martin wrote: > I don't think this will increase safepoint latency (if implemented > appropriately). Methods compiled by C2 may contain counted loops > (with int range) without safepoint. So this may be quite long in > comparison to an interpreted method which can only contain up to 64k > bytecodes while every branch contains a safepoint check. This is to say, I think, that we already have one source of severe safepoint delays, so why not have another one? 64k bytecodes is a lot. > (One might be kind of concerned about no poll in calls in the > current implementation.) I'm not sure why. For every call, there is a return. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From magnus.ihse.bursie at oracle.com Thu Oct 26 09:57:15 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 26 Oct 2017 11:57:15 +0200 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> Message-ID: <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> Coleen, Thank you for addressing this! 
On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: > Summary: removed hotspot version of jvm*h and jni*h files > > Mostly used sed to remove prims/jvm.h and move #include "jvm.h" after > precompiled.h, so if you have repetitive stress wrist issues don't > click on most of these files. > > There were more issues to resolve, however.? The JDK windows jni_md.h > file defined jint as long and the hotspot windows jni_x86.h as int.? I > had to choose the jdk version since it's the public version, so there > are changes to the hotspot files for this. Generally I changed the > code to use 'int' rather than 'jint' where the surrounding API didn't > insist on consistently using java types. We should mostly be using C++ > types within hotspot except in interfaces to native/JNI code.? There > are a couple of hacks in places where adding multiple jint casts was > too painful. > > Tested with JPRT and tier2-4 (in progress). > > open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev Looks great! Just a few comments: * src/java.base/unix/native/include/jni_md.h: I don't think the externally_visible attribute should be there for arm. I know this was the case for the corresponding hotspot file for arm, but that was techically incorrect. The proper dependency here is that externally_visible should be in all JNIEXPORT if and only if we're building with JVM feature "link-time-opt". Traditionally, that feature been enabled when building arm32 builds, and only then, so there's been a (coincidentally) connection here. Nowadays, Oracle does not care about the arm32 builds, and I'm not sure if anyone else is building them with link-time-opt enabled. It does seem wrong to me to export this behavior in the public jni_md.h file, though. I think the correct way to solve this, if we should continue supporting link-time-opt is to make sure this attribute is set for exported hotspot functions. If it's still needed, that is. A quick googling seems to indicate that visibility("default") might be enough in modern gcc's. A third option is to remove the support for link-time-opt entirely, if it's not really used. * src/java.base/unix/native/include/jvm_md.h and src/java.base/windows/native/include/jvm_md.h: These files define a public API, and contain non-trivial changes. I suspect you should file a CSR request. (Even though I realize you're only matching the header file with the reality.) /Magnus > bug link https://bugs.openjdk.java.net/browse/JDK-8189610 > > I have a script to update copyright files on commit. > > Thanks to Magnus and ErikJ for the makefile changes. > > Thanks, > Coleen > From nils.eliasson at oracle.com Thu Oct 26 12:11:37 2017 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 26 Oct 2017 14:11:37 +0200 Subject: JDK10/RFR(M): 8172232: SPARC ISA/CPU feature detection is broken/insufficient (on Linux). In-Reply-To: <3fcc865d-3eda-a341-e112-8417711ee3e5@oracle.com> References: <7d5e1ebb-7de8-66f1-a1f0-db465bcad4ab@oracle.com> <9f2896ca-65dc-557f-793c-4235499cc340@oracle.com> <3fcc865d-3eda-a341-e112-8417711ee3e5@oracle.com> Message-ID: Thanks for fixing Patric, Looks good! Regards, Nils On 2017-10-04 11:04, Patric Hedlin wrote: > Thanks for reviewing Vladimir. > > On 09/29/2017 08:56 PM, Vladimir Kozlov wrote: >> In general it is fine. Few notes. >> You use ifdef DEBUG_SPARC_CAPS which is undefed at the beginning. Is >> it set by gcc by default? >> > Removed. 
> >> Coding style for methods definitions - open parenthesis should be on >> the same line: >> >> + bool match(const char* s) const >> + { >> > Updated/re-formated. > > Refreshed webrev. > > @Adrian: Please validate. > > Best regards, > Patric > >> Thanks, >> Vladimir >> >> On 9/29/17 6:08 AM, Patric Hedlin wrote: >>> Dear all, >>> >>> I would like to ask for help to review the following change/update: >>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8172232 >>> >>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8172232/ >>> >>> >>> 8172232: SPARC ISA/CPU feature detection is broken/insufficient (on >>> Linux). >>> >>> Subsumes (duplicate) JDK-8186579: >>> VM_Version::platform_features() needs update on linux-sparc. >>> >>> >>> Caveat: >>> >>> This update will introduce some redundancies into the code >>> base, features and definitions >>> currently not used, addressed by subsequent bug or feature >>> updates/patches. Fujitsu HW is >>> treated very conservatively. >>> >>> >>> Testing: >>> >>> JDK9/JDK10 local jtreg/hotspot >>> >>> >>> Thanks to Adrian for additional test (and review) support. >>> >>> Tested-By: John Paul Adrian Glaubitz >>> >>> >>> Best regards, >>> Patric >>> > From erik.osterlund at oracle.com Thu Oct 26 14:39:47 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 26 Oct 2017 16:39:47 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> Message-ID: <59F1F3B3.10701@oracle.com> Hi Andrew, On 2017-10-26 10:58, Andrew Haley wrote: > BTW, I don't understand why interpreted code doesn't simply read the > polling page. Or we could even simply read-protect the bytecode > dispatch tables themselves. The reason we do not poll the page in the interpreter is that we need to generate appropriate relocation entries in the code blob for the PCs that we poll on, so that we in the signal handler can look up the code blob, walk the relocation entries, and find precisely why we got the trap, i.e. due to the poll, and precisely what kind of poll, so we know what trampoline needs to be taken into the runtime. While constructing something that does that is indeed possible, it simply did not seem worth the trouble compared to using a branch in these paths. The same reasoning applies for the poll performed in the native wrapper when waking up from native and transitioning into Java. It performs a conditional branch instead of indirect load to avoid signal handler logic for polls that are not performance critical. Only the polls in JIT-compiled code use the optimized indirect load mechanism. And we do not want to read-protect the bytecode dispatch tables, because we want the ability to stop individual threads, and that would stop all of them. I hope that explains it. 
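The branch-style poll being described can be modelled in a few lines (plain C++, not HotSpot code; it only mirrors the shape used in the interpreter and the native wrapper):

    #include <cstdint>

    // Plain C++ model of the branch-style poll: compare a thread-local word
    // and take an explicit slow path.  No trap is involved, so no relocation
    // entry or signal-handler lookup is needed at these call sites.
    struct BranchPollModel {
      volatile uintptr_t _poll_word;           // non-zero means "armed"

      void poll(void (*slow_path)()) {
        if (_poll_word != 0) {
          slow_path();                         // trampoline into the runtime
        }
      }
    };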
Thanks, /Erik From bob.vandette at oracle.com Thu Oct 26 14:45:49 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 26 Oct 2017 10:45:49 -0400 Subject: RFR: 8146115 - Improve docker container detection and resource configuration usage In-Reply-To: References: <74630458-926E-4B3E-B967-6F6ADCA0A406@oracle.com> <2d9dd746-63e1-cade-28f9-5ca1ae1c253e@oracle.com> <200F07CB-35DA-492B-B78D-9EC033EE0431@oracle.com> <833ba1a5-49fc-bb24-ff99-994011af52aa@oracle.com> Message-ID: <321E60F4-567F-4648-BC5C-53903B6C95BF@oracle.com> > On Oct 25, 2017, at 2:57 AM, Kim Barrett wrote: > >> On Oct 24, 2017, at 10:11 AM, Bob Vandette wrote: >> >> >>> On Oct 23, 2017, at 12:52 AM, Kim Barrett wrote: >>> >>>> On Sep 27, 2017, at 9:20 PM, David Holmes wrote: >>>>>> 62 void set_subsystem_path(char *cgroup_path) { >>>>>> >>>>>> If this takes a "const char*" will it save you from casting string literals to "char*" elsewhere? >>>>> I tried several different ways of declaring the container accessor functions and >>>>> always ended up with warnings due to scanf not being able to validate arguments >>>>> since the format string didn?t end up being a string literal. I originally was using templates >>>>> and then ended up with the macros. I tried several different casts but could resolve the problem. >>>> >>>> Sounds like something Kim Barrett should take a look at :) >>> >>> Fortunately, I just happened by. >>> >>> The warnings are because we compile with -Wformat=2, which enables >>> -Wformat-nonliteral (among other things). >>> >>> Use PRAGMA_FORMAT_NONLITERAL_IGNORED, e.g. >>> >>> PRAGMA_DIAG_PUSH >>> PRAGMA_FORMAT_NONLITERAL_IGNORED >>> >>> PRAGMA_DIAG_POP >>> >>> That will silence warnings about sscanf (or anything else!) with a >>> non-literal format string within that . >> >> Thanks but I ended up taking a different approach that resulted in more compact code. >> >> http://cr.openjdk.java.net/~bobv/8146115/webrev.02 > > Not a review, just a few more comments in passing. > > ------------------------------------------------------------------------------ > src/hotspot/os/linux/osContainer_linux.cpp > 150 log_debug(os, container)("Type %s not found in file %s\n", \ > 151 scan_fmt , buf); \ > > uses buf as path, but buf has been clobbered to contain contents from > file. > > Similarly for > 155 log_debug(os, container)("Empty file %s\n", buf); \ I fixed these by adding an additional buffer for the read. > > ------------------------------------------------------------------------------ > src/hotspot/os/linux/osContainer_linux.cpp > 158 log_debug(os, container)("file not found %s\n", buf); \ > > There are many reasons why fopen might fail, and merging them all into > a "file not found" message could be quite confusing. It would be much > better to report the error from errno. I added os::strerror(errno) to all failures from fopen to provide more detail. > > ------------------------------------------------------------------------------ > src/hotspot/os/linux/osContainer_linux.cpp > > Something like the following (where the obvious helpers are made up to > keep this short) would eliminate the macrology. 
> > PRAGMA_DIAG_PUSH > PRAGMA_FORMAT_NONLITERAL_IGNORED > template > int get_subsystem_file_contents_value(CgroupSubsystem* c, > const char* filename, > T* returnval, > const char* scan_fmt, > const char* description) { > const char* line = get_subsystem_file_line(c, filename); > if (line != NULL) { > if (sscanf(line, scan_fmt, returnval) == 1) { > return 0; > } else { > report_subsystem_file_contents_parse_error(description, c, filename); > } > } > return OSCONTAINER_ERROR; > } > PRAGMA_DIAG_POP > > int subsystem_file_contents_int(CgroupSubsystem* c, > const char* filename, > int* returnval) { > return get_subsystem_file_contents_value(c, filename, returnval, "%d", "int"); > } > I originally tried to use a template but ran into the issue of the literal and strings needed to be handled differently. I wasn?t sure how to limit the length of the string but I now see that I can use something like ?%1023s?. I?ll give it a try. Bob. From aph at redhat.com Thu Oct 26 16:05:11 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 26 Oct 2017 17:05:11 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <59F1F3B3.10701@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> Message-ID: <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> On 26/10/17 15:39, Erik ?sterlund wrote: > The reason we do not poll the page in the interpreter is that we > need to generate appropriate relocation entries in the code blob for > the PCs that we poll on, so that we in the signal handler can look > up the code blob, walk the relocation entries, and find precisely > why we got the trap, i.e. due to the poll, and precisely what kind > of poll, so we know what trampoline needs to be taken into the > runtime. Not really, no. If we know that we're in the interpreter and the faulting address is the safepoint poll, then we can read all of the context we need from the interpreter registers and the frame. > While constructing something that does that is indeed possible, it > simply did not seem worth the trouble compared to using a branch in > these paths. The same reasoning applies for the poll performed in > the native wrapper when waking up from native and transitioning into > Java. It performs a conditional branch instead of indirect load to > avoid signal handler logic for polls that are not performance > critical. If we're talking about performance, the existing bytecode interpreter is exquisitely carefully coded, even going to the extent of having multiple dispatch tables for safepoint- and non-safepoint cases. Clearly the original authors weren't thinking that code was not performance critical or they wouldn't have done what they did. I suppose, though, that the design we have is from the early days when people diligently strove to make the interpreter as fast as possible. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From erik.osterlund at oracle.com Thu Oct 26 17:00:11 2017 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Thu, 26 Oct 2017 19:00:11 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> Message-ID: <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> Hi Andrew, > On 26 Oct 2017, at 18:05, Andrew Haley wrote: > >> On 26/10/17 15:39, Erik ?sterlund wrote: >> >> The reason we do not poll the page in the interpreter is that we >> need to generate appropriate relocation entries in the code blob for >> the PCs that we poll on, so that we in the signal handler can look >> up the code blob, walk the relocation entries, and find precisely >> why we got the trap, i.e. due to the poll, and precisely what kind >> of poll, so we know what trampoline needs to be taken into the >> runtime. > > Not really, no. If we know that we're in the interpreter and the > faulting address is the safepoint poll, then we can read all of the > context we need from the interpreter registers and the frame. That sounds like what I said. As I said, it is definitely possible to dig out that it was an interpreter safepoint poll causing the trap given the execution context in the interpreter (and appropriate metadata generated for the trapping PC), and send the trapping thread back to a trampoline that saves state appropriately and calls into the runtime to yield to the safepoint synchronizer, like we do for the JIT-compiled code. But the cost of the conditional branch is empirically (this was attempted and measured a while ago) approximately the same as the indirect load during "normal circumstances". The indirect load was only marginally better. Therefore that added complexity with the signal handler dance was simply not warranted for the interpreter. It was only warranted when polling in the absolutely most performance critical code, i.e. JIT compiled code. > >> While constructing something that does that is indeed possible, it >> simply did not seem worth the trouble compared to using a branch in >> these paths. The same reasoning applies for the poll performed in >> the native wrapper when waking up from native and transitioning into >> Java. It performs a conditional branch instead of indirect load to >> avoid signal handler logic for polls that are not performance >> critical. > > If we're talking about performance, the existing bytecode interpreter > is exquisitely carefully coded, even going to the extent of having > multiple dispatch tables for safepoint- and non-safepoint cases. > Clearly the original authors weren't thinking that code was not > performance critical or they wouldn't have done what they did. 
I > suppose, though, that the design we have is from the early days when > people diligently strove to make the interpreter as fast as possible. On the other hand, branches have become a lot faster in "recent" years, and this one is particularly trivial to predict. Therefore I prefer to base design decisions on empirical measurements. And introducing that complexity for an close to insignificantly faster interpreter poll does not seem encouraging to me. Do you agree? Thanks, /Erik > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From paul.sandoz at oracle.com Thu Oct 26 17:03:15 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 26 Oct 2017 10:03:15 -0700 Subject: [10] RFR 8186046 Minimal ConstantDynamic support Message-ID: Hi, Please review the following patch for minimal dynamic constant support: http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8186046-minimal-condy-support-hs/webrev/ https://bugs.openjdk.java.net/browse/JDK-8186046 https://bugs.openjdk.java.net/browse/JDK-8186209 This patch is based on the JDK 10 unified HotSpot repository. Testing so far looks good. By minimal i mean just the support in the runtime for a dynamic constant pool entry to be referenced by a LDC instruction or a bootstrap method argument. Much of the work leverages the foundations built by invoke dynamic but is arguably simpler since resolution is less complex. A small set of bootstrap methods will be proposed as a follow on issue for 10 (these are currently being refined in the amber repository). Bootstrap method invocation has not changed (and the rules are the same for dynamic constants and indy). It is planned to enhance this in a further major release to support lazy resolution of bootstrap method arguments. The CSR for the VM specification is here: https://bugs.openjdk.java.net/browse/JDK-8189199 the j.l.invoke package documentation was also updated but please consider the VM specification as the definitive "source of truth" (we may clean up this area further later on so it becomes more informative, and that may also apply to duplicative text on MethodHandles/VarHandles). Any AoT-related work will be deferred to a future release. ? This patch only supports x64 platforms. There is a small set of changes specific to x64 (specifically to support null and primitives constants, as prior to this patch null was used as a sentinel for resolution and certain primitives types would never have been encountered, such as say byte). We will need to follow up with the SPARC platform and it is hoped/anticipated that OpenJDK members responsible for other platforms (namely ARM and PPC) will separately provide patches. ? Many of tests rely on an experimental byte code API that supports the generation of byte code with dynamic constants. One test uses class file bytes produced from a modified version of asmtools. The modifications have now been pushed but a new version of asmtools need to be rolled into jtreg before the test can operate directly on asmtools information rather than embedding class file bytes directly in the test. ? Paul. 
From aph at redhat.com Thu Oct 26 17:19:06 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 26 Oct 2017 18:19:06 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> Message-ID: On 26/10/17 18:00, Erik Osterlund wrote: > Hi Andrew, > >> On 26 Oct 2017, at 18:05, Andrew Haley wrote: >> >>> On 26/10/17 15:39, Erik ?sterlund wrote: >>> >>> The reason we do not poll the page in the interpreter is that we >>> need to generate appropriate relocation entries in the code blob for >>> the PCs that we poll on, so that we in the signal handler can look >>> up the code blob, walk the relocation entries, and find precisely >>> why we got the trap, i.e. due to the poll, and precisely what kind >>> of poll, so we know what trampoline needs to be taken into the >>> runtime. >> >> Not really, no. If we know that we're in the interpreter and the >> faulting address is the safepoint poll, then we can read all of the >> context we need from the interpreter registers and the frame. > > That sounds like what I said. Not exactly. We do not need to generate any more relocation entries. > But the cost of the conditional branch is empirically (this was > attempted and measured a while ago) approximately the same as the > indirect load during "normal circumstances". The indirect load was > only marginally better. That's interesting. The cost of the SEGV trap going through the kernel is fairly high, and I'm now wondering if, for very fast safepoint responses, we'd be better off not doing it. The cost of the write protect, given that it probably involves an IPI on all processors, isn't cheap either. >>> While constructing something that does that is indeed possible, it >>> simply did not seem worth the trouble compared to using a branch in >>> these paths. The same reasoning applies for the poll performed in >>> the native wrapper when waking up from native and transitioning into >>> Java. It performs a conditional branch instead of indirect load to >>> avoid signal handler logic for polls that are not performance >>> critical. >> >> If we're talking about performance, the existing bytecode interpreter >> is exquisitely carefully coded, even going to the extent of having >> multiple dispatch tables for safepoint- and non-safepoint cases. >> Clearly the original authors weren't thinking that code was not >> performance critical or they wouldn't have done what they did. I >> suppose, though, that the design we have is from the early days when >> people diligently strove to make the interpreter as fast as possible. > > On the other hand, branches have become a lot faster in "recent" > years, and this one is particularly trivial to predict. 
Therefore I > prefer to base design decisions on empirical measurements. And > introducing that complexity for an close to insignificantly faster > interpreter poll does not seem encouraging to me. Do you agree? Perhaps. It's interesting that the result falls one way in compiled code and the other in interpreted code. If the choice is so very finely balanced, though, it sort-of makes sense. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From kim.barrett at oracle.com Thu Oct 26 18:45:03 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 26 Oct 2017 14:45:03 -0400 Subject: RFR: 8163897: oop_store has unnecessary memory barriers In-Reply-To: References: Message-ID: <198FAF45-59AD-4618-86B7-279C81248F9B@oracle.com> > On Oct 23, 2017, at 1:44 AM, Kim Barrett wrote: > > Please review this change to the oop_store function template, which > removes some unnecessary memory barriers, moves CMS-specific code into > GC-specific (though not completely CMS-specific) areas, and cleans up > the API a bit. See the CR for more details about the problems. Due to some miscommunication, Erik O and I have both developed solutions to this. Mine is a stand-alone piece of work for me, while his is some number of changes in a long patch train. In the interest of not imposing possibly messy merging on Erik, I'm withdrawing this RFR and reassigning the bug to him. From mandy.chung at oracle.com Thu Oct 26 18:47:19 2017 From: mandy.chung at oracle.com (mandy chung) Date: Thu, 26 Oct 2017 11:47:19 -0700 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> Message-ID: On 10/26/17 2:57 AM, Magnus Ihse Bursie wrote: > A third option is to remove the support for link-time-opt entirely, if > it's not really used. > > * src/java.base/unix/native/include/jvm_md.h and > src/java.base/windows/native/include/jvm_md.h: > > These files define a public API, and contain non-trivial changes. I > suspect you should file a CSR request. (Even though I realize you're > only matching the header file with the reality.) jvm.h and jvm_md.h are not public API and they are not copied to the $JAVA_HOME/includes directly.? This does raise the question that jvm*.h may belong to other location than src/java.base/{share,$OS}/native/include. Mandy From coleen.phillimore at oracle.com Thu Oct 26 20:34:30 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 26 Oct 2017 16:34:30 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> Message-ID: <8e157a28-5397-95c1-03dc-de6d0d3d37e8@oracle.com> On 10/26/17 2:47 PM, mandy chung wrote: > > > On 10/26/17 2:57 AM, Magnus Ihse Bursie wrote: >> A third option is to remove the support for link-time-opt entirely, >> if it's not really used. >> >> * src/java.base/unix/native/include/jvm_md.h and >> src/java.base/windows/native/include/jvm_md.h: >> >> These files define a public API, and contain non-trivial changes. I >> suspect you should file a CSR request. (Even though I realize you're >> only matching the header file with the reality.) 
> > jvm.h and jvm_md.h are not public API and they are not copied to the > $JAVA_HOME/includes directly.? This does raise the question that > jvm*.h may belong to other location than > src/java.base/{share,$OS}/native/include. I'm not sure where else it would go honestly, but it could be moved outside this changeset.? The good thing about where it is, is that the -I directives in the makefiles find both jni.h and jvm.h. thanks, Coleen > > Mandy From coleen.phillimore at oracle.com Thu Oct 26 20:44:05 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 26 Oct 2017 16:44:05 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> Message-ID: <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> ?Hi Magnus, Thank you for reviewing this.?? I have a new version that takes out the hack in globalDefinitions.hpp and adds casts to src/hotspot/share/opto/type.cpp instead. Also some fixes from Martin at SAP. open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev see below. On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: > Coleen, > > Thank you for addressing this! > > On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >> Summary: removed hotspot version of jvm*h and jni*h files >> >> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" after >> precompiled.h, so if you have repetitive stress wrist issues don't >> click on most of these files. >> >> There were more issues to resolve, however.? The JDK windows jni_md.h >> file defined jint as long and the hotspot windows jni_x86.h as int.? >> I had to choose the jdk version since it's the public version, so >> there are changes to the hotspot files for this. Generally I changed >> the code to use 'int' rather than 'jint' where the surrounding API >> didn't insist on consistently using java types. We should mostly be >> using C++ types within hotspot except in interfaces to native/JNI >> code.? There are a couple of hacks in places where adding multiple >> jint casts was too painful. >> >> Tested with JPRT and tier2-4 (in progress). >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev > > Looks great! > > Just a few comments: > > * src/java.base/unix/native/include/jni_md.h: > > I don't think the externally_visible attribute should be there for > arm. I know this was the case for the corresponding hotspot file for > arm, but that was techically incorrect. The proper dependency here is > that externally_visible should be in all JNIEXPORT if and only if > we're building with JVM feature "link-time-opt". Traditionally, that > feature been enabled when building arm32 builds, and only then, so > there's been a (coincidentally) connection here. Nowadays, Oracle does > not care about the arm32 builds, and I'm not sure if anyone else is > building them with link-time-opt enabled. > > It does seem wrong to me to export this behavior in the public > jni_md.h file, though. I think the correct way to solve this, if we > should continue supporting link-time-opt is to make sure this > attribute is set for exported hotspot functions. If it's still needed, > that is. A quick googling seems to indicate that visibility("default") > might be enough in modern gcc's. 
> > A third option is to remove the support for link-time-opt entirely, if > it's not really used. I didn't know how to change this since we are still building ARM with the jdk10/hs repository, and ARM needed this change.? I could wait until we bring down the jdk10/master changes that remove the ARM build and remove this conditional before I push.? Or we could file an RFE to remove link-time-opt (?) and remove it then? > > * src/java.base/unix/native/include/jvm_md.h and > src/java.base/windows/native/include/jvm_md.h: > > These files define a public API, and contain non-trivial changes. I > suspect you should file a CSR request. (Even though I realize you're > only matching the header file with the reality.) > I filed the CSR.?? Waiting for the next steps. Thanks, Coleen > /Magnus > >> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >> >> I have a script to update copyright files on commit. >> >> Thanks to Magnus and ErikJ for the makefile changes. >> >> Thanks, >> Coleen >> > From mandy.chung at oracle.com Thu Oct 26 21:27:19 2017 From: mandy.chung at oracle.com (mandy chung) Date: Thu, 26 Oct 2017 14:27:19 -0700 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <8e157a28-5397-95c1-03dc-de6d0d3d37e8@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <8e157a28-5397-95c1-03dc-de6d0d3d37e8@oracle.com> Message-ID: <15a07ec6-3fc3-f757-1711-8d088d194115@oracle.com> On 10/26/17 1:34 PM, coleen.phillimore at oracle.com wrote: > > > On 10/26/17 2:47 PM, mandy chung wrote: >> >> >> On 10/26/17 2:57 AM, Magnus Ihse Bursie wrote: >>> A third option is to remove the support for link-time-opt entirely, >>> if it's not really used. >>> >>> * src/java.base/unix/native/include/jvm_md.h and >>> src/java.base/windows/native/include/jvm_md.h: >>> >>> These files define a public API, and contain non-trivial changes. I >>> suspect you should file a CSR request. (Even though I realize you're >>> only matching the header file with the reality.) >> >> jvm.h and jvm_md.h are not public API and they are not copied to the >> $JAVA_HOME/includes directly.? This does raise the question that >> jvm*.h may belong to other location than >> src/java.base/{share,$OS}/native/include. > > I'm not sure where else it would go honestly, but it could be moved > outside this changeset.? The good thing about where it is, is that the > -I directives in the makefiles find both jni.h and jvm.h. I agree we should keep this location for this change (the location is a separate issue).? I reviewed the change that looks good to me. Mandy From ioi.lam at oracle.com Thu Oct 26 21:53:15 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 26 Oct 2017 14:53:15 -0700 Subject: RFR [S] JDK-8179624 [REDO] Avoid repeated calls to JavaThread::last_frame in InterpreterRuntime Message-ID: <842ce767-4436-02a3-f536-b71fed1fa6ed@oracle.com> Hi, Please review the following change. It's a redo of a previous botched attempt (JDK-8179305) that had a typo which caused JIT-related crashes. Thanks to Dean for spotting the typo. + Bug https://bugs.openjdk.java.net/browse/JDK-8179624 + The full changeset: http://cr.openjdk.java.net/~iklam/jdk10/8179624-redo-8179305-avoid-last-frame.v01.full/ + The delta from the botched attempt ? (fixing the typo with monitor_begin/monitor_end): http://cr.openjdk.java.net/~iklam/jdk10/8179624-redo-8179305-avoid-last-frame.v01.redo_delta/ + Testing: hotspot tier1~5 tests. 
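The shape of the change is easy to picture; a sketch of the idea only, the actual helper and its call sites are in the webrev:

    // Sketch of the idea only (HotSpot-internal types; the usual frame and
    // thread includes are omitted here).  Compute thread->last_frame() once
    // per InterpreterRuntime entry and reuse it, instead of re-deriving the
    // last frame for every query.
    class LastFrameAccessorSketch {
      frame _last_frame;                       // cached once on construction
     public:
      LastFrameAccessorSketch(JavaThread* thread)
        : _last_frame(thread->last_frame()) {}
      Method* method() const { return _last_frame.interpreter_frame_method(); }
      address bcp()    const { return _last_frame.interpreter_frame_bcp();    }
    };

    // Illustrative use inside an entry point:
    //   LastFrameAccessorSketch last_frame(thread);
    //   Method* m = last_frame.method();      // no further last_frame() calls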
Thanks - Ioi From dean.long at oracle.com Thu Oct 26 23:41:35 2017 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 26 Oct 2017 16:41:35 -0700 Subject: RFR [S] JDK-8179624 [REDO] Avoid repeated calls to JavaThread::last_frame in InterpreterRuntime In-Reply-To: <842ce767-4436-02a3-f536-b71fed1fa6ed@oracle.com> References: <842ce767-4436-02a3-f536-b71fed1fa6ed@oracle.com> Message-ID: <958fef30-03d7-5b8c-1f3b-0bcca945565f@oracle.com> Looks good. dl On 10/26/17 2:53 PM, Ioi Lam wrote: > Hi, > > Please review the following change. It's a redo of a previous botched > attempt (JDK-8179305) that had a typo which caused JIT-related crashes. > > Thanks to Dean for spotting the typo. > > + Bug > https://bugs.openjdk.java.net/browse/JDK-8179624 > > > + The full changeset: > http://cr.openjdk.java.net/~iklam/jdk10/8179624-redo-8179305-avoid-last-frame.v01.full/ > > > > + The delta from the botched attempt > (fixing the typo with monitor_begin/monitor_end): > http://cr.openjdk.java.net/~iklam/jdk10/8179624-redo-8179305-avoid-last-frame.v01.redo_delta/ > > > > + Testing: > hotspot tier1~5 tests. > > > Thanks > - Ioi From hohensee at amazon.com Thu Oct 26 23:54:35 2017 From: hohensee at amazon.com (Hohensee, Paul) Date: Thu, 26 Oct 2017 23:54:35 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> Message-ID: <4815B009-174E-4363-A60F-7EC3D4EDE3ED@amazon.com> As a reference point, Android Java branches on a flag in the TLS rather than issuing a poisoned page probe. On x86 at least, there's no performance disadvantage: branch prediction makes the compare-and-branch pair a single-cycle operation in the vast majority of cases. The interpreter was built at a time when branches had non-zero cost, as evidenced by the prediction bits in the sparc64 predicted branch instructions. The compare-and-branch code sequence takes up icache space in the interpreter (vs. zero for switching the dispatch table) and icache is still a limited resource on modern processors, so that's an argument for switching dispatch tables. For compiled code, compare-and-branch takes a bit more space than the current poison page probe, but not enough to matter imo. Compiled code is executed far more than interpreter code, so I'd go with optimizing compiled code performance. 
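For context, "switching the dispatch table" refers to the scheme the template interpreter already uses for global safepoints; a plain C++ model (simplified names, not the real TemplateInterpreter code):

    #include <cstring>

    // Plain C++ model, simplified names.  To request a stop, the VM copies a
    // "safepoint" table over the active table, so every subsequent bytecode
    // dispatch funnels through a checking entry: zero per-bytecode icache
    // cost, but it necessarily affects all interpreter threads at once.
    typedef void (*BytecodeEntry)();
    const int num_bytecodes = 256;

    struct DispatchTablesModel {
      BytecodeEntry active[num_bytecodes];   // what the dispatch loop indexes
      BytecodeEntry normal[num_bytecodes];   // plain entries
      BytecodeEntry safept[num_bytecodes];   // entries that check for a stop first

      void notice_safepoints() { std::memcpy(active, safept, sizeof(active)); }
      void ignore_safepoints() { std::memcpy(active, normal, sizeof(active)); }
    };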
Thanks, Paul On 10/26/17, 10:20 AM, "hotspot-dev on behalf of Andrew Haley" wrote: On 26/10/17 18:00, Erik Osterlund wrote: > Hi Andrew, > >> On 26 Oct 2017, at 18:05, Andrew Haley wrote: >> >>> On 26/10/17 15:39, Erik ?sterlund wrote: >>> >>> The reason we do not poll the page in the interpreter is that we >>> need to generate appropriate relocation entries in the code blob for >>> the PCs that we poll on, so that we in the signal handler can look >>> up the code blob, walk the relocation entries, and find precisely >>> why we got the trap, i.e. due to the poll, and precisely what kind >>> of poll, so we know what trampoline needs to be taken into the >>> runtime. >> >> Not really, no. If we know that we're in the interpreter and the >> faulting address is the safepoint poll, then we can read all of the >> context we need from the interpreter registers and the frame. > > That sounds like what I said. Not exactly. We do not need to generate any more relocation entries. > But the cost of the conditional branch is empirically (this was > attempted and measured a while ago) approximately the same as the > indirect load during "normal circumstances". The indirect load was > only marginally better. That's interesting. The cost of the SEGV trap going through the kernel is fairly high, and I'm now wondering if, for very fast safepoint responses, we'd be better off not doing it. The cost of the write protect, given that it probably involves an IPI on all processors, isn't cheap either. >>> While constructing something that does that is indeed possible, it >>> simply did not seem worth the trouble compared to using a branch in >>> these paths. The same reasoning applies for the poll performed in >>> the native wrapper when waking up from native and transitioning into >>> Java. It performs a conditional branch instead of indirect load to >>> avoid signal handler logic for polls that are not performance >>> critical. >> >> If we're talking about performance, the existing bytecode interpreter >> is exquisitely carefully coded, even going to the extent of having >> multiple dispatch tables for safepoint- and non-safepoint cases. >> Clearly the original authors weren't thinking that code was not >> performance critical or they wouldn't have done what they did. I >> suppose, though, that the design we have is from the early days when >> people diligently strove to make the interpreter as fast as possible. > > On the other hand, branches have become a lot faster in "recent" > years, and this one is particularly trivial to predict. Therefore I > prefer to base design decisions on empirical measurements. And > introducing that complexity for an close to insignificantly faster > interpreter poll does not seem encouraging to me. Do you agree? Perhaps. It's interesting that the result falls one way in compiled code and the other in interpreted code. If the choice is so very finely balanced, though, it sort-of makes sense. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From ioi.lam at oracle.com Fri Oct 27 00:02:37 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 26 Oct 2017 17:02:37 -0700 Subject: RFR [S] JDK-8179624 [REDO] Avoid repeated calls to JavaThread::last_frame in InterpreterRuntime In-Reply-To: <958fef30-03d7-5b8c-1f3b-0bcca945565f@oracle.com> References: <842ce767-4436-02a3-f536-b71fed1fa6ed@oracle.com> <958fef30-03d7-5b8c-1f3b-0bcca945565f@oracle.com> Message-ID: Thanks Dean! 
- Ioi

On 10/26/17 4:41 PM, dean.long at oracle.com wrote:
> Looks good.
>
> dl
>
> On 10/26/17 2:53 PM, Ioi Lam wrote:
>> Hi,
>>
>> Please review the following change. It's a redo of a previous botched
>> attempt (JDK-8179305) that had a typo which caused JIT-related crashes.
>>
>> Thanks to Dean for spotting the typo.
>>
>> + Bug
>> https://bugs.openjdk.java.net/browse/JDK-8179624
>>
>> + The full changeset:
>> http://cr.openjdk.java.net/~iklam/jdk10/8179624-redo-8179305-avoid-last-frame.v01.full/
>>
>> + The delta from the botched attempt
>>   (fixing the typo with monitor_begin/monitor_end):
>> http://cr.openjdk.java.net/~iklam/jdk10/8179624-redo-8179305-avoid-last-frame.v01.redo_delta/
>>
>> + Testing:
>> hotspot tier1~5 tests.
>>
>> Thanks
>> - Ioi
>

From erik.osterlund at oracle.com Fri Oct 27 06:51:48 2017
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Fri, 27 Oct 2017 08:51:48 +0200
Subject: RFR(XL): 8185640: Thread-local handshakes
In-Reply-To: <4815B009-174E-4363-A60F-7EC3D4EDE3ED@amazon.com>
References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> <4815B009-174E-4363-A60F-7EC3D4EDE3ED@amazon.com>
Message-ID: 

Hi Paul,

Regarding the conditional branch on the TLS, I have the following to say:

1) Mikael Gerdin tried an earlier prototype doing that, and found that the indirect load was more desirable for now. The reason is that the performance of the branch variant is more sensitive to chip details such as the number of branch ports on the reservation stations (double branch ports were introduced in haswell). On some chips the branch would marginally win, on some it would marginally lose. But there are more pathological cases for the branch, like e.g. a nonsense loop that does not do anything but loop. Arguably that is a nonsense benchmark though. But since the indirect load was less sensitive to the chip details, always performed well consistently, was more predictable and deterministic, that approach was selected. Perhaps this decision may change in a few years, but it seems a bit early for that now.

2) As for the number of bytes in the code stream of the global testl (x86) to a conditional branch on TLS, you can get an optimal encoding of the branch variant of the same length, 6 bytes, on x86. The optimal testb on offset zero is 4 bytes and a short branch is 2 bytes. For the curious reader: in the past (years ago now) I prototyped getting the optimal machine encoding of a TLS conditional branch poll. I ended up exposing different thread pointers to the TLS register at an offset into Thread in the JIT to be able to get that offset zero, and changing locking code to deal with the owner being misaligned, and all sorts of fun. But it ultimately didn't seem to make any measurable difference at all. But I got the T-shirt anyway.
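To make the encoding remark in point 2 concrete, the "offset zero" trick amounts to something like this (purely hypothetical layout, not HotSpot's Thread):

  #include <cstddef>

  struct PollableThread {
    volatile unsigned char poll_flag;   // at offset 0: 4-byte testb + 2-byte jcc
    // ... the rest of the thread state follows ...
    void* other_state;
  };
  static_assert(offsetof(PollableThread, poll_flag) == 0,
                "the poll flag must stay at offset zero for the short encoding");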
Hope this explains why the indirect load was chosen over the conditional branch. Thanks, /Erik > On 27 Oct 2017, at 01:54, Hohensee, Paul wrote: > > As a reference point, Android Java branches on a flag in the TLS rather than issuing a poisoned page probe. On x86 at least, there?s no performance disadvantage: branch prediction makes the compare-and-branch pair a single-cycle operation in the vast majority of cases. > > The interpreter was built at a time when branches had non-zero cost, as evidenced by the prediction bits in the sparc64 predicted branch instructions. The compare-and-branch code sequence takes up icache space in the interpreter (vs. zero for switching the dispatch table) and icache is still a limited resource on modern processors, so that?s an argument for switching dispatch tables. For compiled code, compare-and-branch takes a bit more space than the current poison page probe, but not enough to matter imo. Compiled code is executed far more than interpreter code, so I?d go with optimizing compiled code performance. > > Thanks, > > Paul > > On 10/26/17, 10:20 AM, "hotspot-dev on behalf of Andrew Haley" wrote: > > On 26/10/17 18:00, Erik Osterlund wrote: >> Hi Andrew, >> >>>> On 26 Oct 2017, at 18:05, Andrew Haley wrote: >>>> >>>> On 26/10/17 15:39, Erik ?sterlund wrote: >>>> >>>> The reason we do not poll the page in the interpreter is that we >>>> need to generate appropriate relocation entries in the code blob for >>>> the PCs that we poll on, so that we in the signal handler can look >>>> up the code blob, walk the relocation entries, and find precisely >>>> why we got the trap, i.e. due to the poll, and precisely what kind >>>> of poll, so we know what trampoline needs to be taken into the >>>> runtime. >>> >>> Not really, no. If we know that we're in the interpreter and the >>> faulting address is the safepoint poll, then we can read all of the >>> context we need from the interpreter registers and the frame. >> >> That sounds like what I said. > > Not exactly. We do not need to generate any more relocation entries. > >> But the cost of the conditional branch is empirically (this was >> attempted and measured a while ago) approximately the same as the >> indirect load during "normal circumstances". The indirect load was >> only marginally better. > > That's interesting. The cost of the SEGV trap going through the > kernel is fairly high, and I'm now wondering if, for very fast > safepoint responses, we'd be better off not doing it. The cost of the > write protect, given that it probably involves an IPI on all > processors, isn't cheap either. > >>>> While constructing something that does that is indeed possible, it >>>> simply did not seem worth the trouble compared to using a branch in >>>> these paths. The same reasoning applies for the poll performed in >>>> the native wrapper when waking up from native and transitioning into >>>> Java. It performs a conditional branch instead of indirect load to >>>> avoid signal handler logic for polls that are not performance >>>> critical. >>> >>> If we're talking about performance, the existing bytecode interpreter >>> is exquisitely carefully coded, even going to the extent of having >>> multiple dispatch tables for safepoint- and non-safepoint cases. >>> Clearly the original authors weren't thinking that code was not >>> performance critical or they wouldn't have done what they did. 
I >>> suppose, though, that the design we have is from the early days when >>> people diligently strove to make the interpreter as fast as possible. >> >> On the other hand, branches have become a lot faster in "recent" >> years, and this one is particularly trivial to predict. Therefore I >> prefer to base design decisions on empirical measurements. And >> introducing that complexity for an close to insignificantly faster >> interpreter poll does not seem encouraging to me. Do you agree? > > Perhaps. It's interesting that the result falls one way in compiled > code and the other in interpreted code. If the choice is so very > finely balanced, though, it sort-of makes sense. > > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > > From erik.osterlund at oracle.com Fri Oct 27 07:11:32 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Fri, 27 Oct 2017 09:11:32 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> Message-ID: <59F2DC24.8050701@oracle.com> Hi Andrew, On 2017-10-26 19:19, Andrew Haley wrote: > On 26/10/17 18:00, Erik Osterlund wrote: >> Hi Andrew, >> >>> On 26 Oct 2017, at 18:05, Andrew Haley wrote: >>> >>>> On 26/10/17 15:39, Erik ?sterlund wrote: >>>> >>>> The reason we do not poll the page in the interpreter is that we >>>> need to generate appropriate relocation entries in the code blob for >>>> the PCs that we poll on, so that we in the signal handler can look >>>> up the code blob, walk the relocation entries, and find precisely >>>> why we got the trap, i.e. due to the poll, and precisely what kind >>>> of poll, so we know what trampoline needs to be taken into the >>>> runtime. >>> Not really, no. If we know that we're in the interpreter and the >>> faulting address is the safepoint poll, then we can read all of the >>> context we need from the interpreter registers and the frame. >> That sounds like what I said. > Not exactly. We do not need to generate any more relocation entries. Maybe. >> But the cost of the conditional branch is empirically (this was >> attempted and measured a while ago) approximately the same as the >> indirect load during "normal circumstances". The indirect load was >> only marginally better. > That's interesting. The cost of the SEGV trap going through the > kernel is fairly high, and I'm now wondering if, for very fast > safepoint responses, we'd be better off not doing it. The cost of the > write protect, given that it probably involves an IPI on all > processors, isn't cheap either. The current mechanism does not use mprotect to stop threads. It has one global trapping page and one global not trapping page. It simply performs stores to flip the polling word to point at the trapping page. 
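In code, arming a thread is just a plain pointer store; a minimal standalone sketch (invented names, plain POSIX mmap, error handling omitted, not the actual HotSpot implementation):

  #include <sys/mman.h>
  #include <cstdint>

  static void* good_page;   // mapped PROT_READ: loads through it succeed
  static void* bad_page;    // mapped PROT_NONE: loads through it trap

  struct ThreadSketch {
    volatile uintptr_t polling_page;    // what the generated poll loads through
  };

  void init_pages() {
    good_page = mmap(nullptr, 4096, PROT_READ,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    bad_page  = mmap(nullptr, 4096, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  }

  // Arming/disarming is a store to the thread's polling word, not an mprotect:
  void arm(ThreadSketch* t)    { t->polling_page = (uintptr_t)bad_page; }
  void disarm(ThreadSketch* t) { t->polling_page = (uintptr_t)good_page; }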
So I am not so concerned about TLB shootdown costs here. As for the SEGV, the mechanism was stress tested (shooting handshakes on all threads continuously) to see how expensive the SEGV was, and the outcome was that it was surprisingly cheap. So we did not pursue making the slow path faster. > >>>> While constructing something that does that is indeed possible, it >>>> simply did not seem worth the trouble compared to using a branch in >>>> these paths. The same reasoning applies for the poll performed in >>>> the native wrapper when waking up from native and transitioning into >>>> Java. It performs a conditional branch instead of indirect load to >>>> avoid signal handler logic for polls that are not performance >>>> critical. >>> If we're talking about performance, the existing bytecode interpreter >>> is exquisitely carefully coded, even going to the extent of having >>> multiple dispatch tables for safepoint- and non-safepoint cases. >>> Clearly the original authors weren't thinking that code was not >>> performance critical or they wouldn't have done what they did. I >>> suppose, though, that the design we have is from the early days when >>> people diligently strove to make the interpreter as fast as possible. >> On the other hand, branches have become a lot faster in "recent" >> years, and this one is particularly trivial to predict. Therefore I >> prefer to base design decisions on empirical measurements. And >> introducing that complexity for an close to insignificantly faster >> interpreter poll does not seem encouraging to me. Do you agree? > Perhaps. It's interesting that the result falls one way in compiled > code and the other in interpreted code. If the choice is so very > finely balanced, though, it sort-of makes sense. Yeah. I wrote about that decision to use indirect load instead of conditional branch in compiled code in an email to Paul if you are interested. Thanks, /Erik From david.holmes at oracle.com Fri Oct 27 07:23:34 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 27 Oct 2017 17:23:34 +1000 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> Message-ID: <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> Hi Coleen, Thanks for tackling this. > Summary: removed hotspot version of jvm*h and jni*h files Can you update the bug synopsis to show it covers both sets of files please. I hate to start with this (and it took me quite a while to realize it) but as Mandy pointed out jvm.h is not an exported interface from the JDK to the outside world (so not subject to CSR review), but is a private interface between the JVM and the JDK libraries. So I think really jvm.h belongs in the hotspot sources where it was, while jni.h belongs in the exported JDK sources. In which case the bulk of your changes to the hotspot files would not be needed - sorry. Moving on ... First to address the initial comments/query you had: > The JDK windows jni_md.h file defined jint as long and the hotspot > windows jni_x86.h as int. I had to choose the jdk version since it's the > public version, so there are changes to the hotspot files for this. On Windows int and long are always the same as it uses ILP32 or LLP64 (not LP64 like *nix platforms). So either choice should be fine. 
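As a standalone illustration of the data-model point (the sizes agree on Windows, but the types stay distinct, which is what the template and overload issues discussed below are about):

  #include <type_traits>

  #ifdef _WIN64
  static_assert(sizeof(int) == sizeof(long),
                "LLP64: int and long are both 32-bit on Windows");
  #endif
  static_assert(!std::is_same<int, long>::value,
                "but they remain distinct types for overload resolution and "
                "template argument deduction");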
That said there are some odd casting issues I comment on below. Does the VS compiler complain about mixing int and long in expressions?

> Generally I changed the code to use 'int' rather than 'jint' where the
> surrounding API didn't insist on consistently using java types. We
> should mostly be using C++ types within hotspot except in interfaces to
> native/JNI code.

I think you pulled too hard on a few threads here and things are starting to unravel. There are numerous cases I refer to below where either the cast seems unnecessary/inappropriate or else highlights a bunch of additional changes that also need to be made. The fan out from this could be horrendous. Unless you actually get some kind of error - and I'd like to understand the details of those - I would not suggest making these changes as part of this work.

Looking through I have quite a few queries/comments - apologies in advance as I know how tedious this is:

make/hotspot/lib/CompileLibjsig.gmk
src/java.base/solaris/native/libjsig/jsig.c

Took a while to figure out why the include was needed. :) As a follow up I suggest just deleting the -I include directive, delete the Solaris-only definition of JSIG_VERSION_1_4_1, and delete everything to do with JVM_get_libjsig_version. It is all obsolete.

---

src/hotspot/cpu/arm/interp_masm_arm.cpp

Why did you need to add the jvm.h include?

---

src/hotspot/os/windows/os_windows.cpp

The type of process_exiting should be uint to match the DWORD of GetCurrentThreadId. Then you shouldn't need any casts. Also you missed this jint cast:

3796         process_exiting != (jint)GetCurrentThreadId()) {

---

src/hotspot/share/c1/c1_Canonicalizer.hpp

  43 #ifdef _WINDOWS
  44   // jint is defined as long in jni_md.h, so convert from int to jint
  45   void set_constant(int x)                       { set_constant((jint)x); }
  46 #endif

Why is this necessary? int and long are the same on Windows. The whole point is that jint hides the underlying type, so where does this go wrong?

---

src/hotspot/share/c1/c1_LinearScan.cpp

 ConstantIntValue((jint)0);

why is this cast needed? what causes the ambiguity? (If this was a template I'd understand ;-) ). Also didn't you change that constructor to take an int anyway - not that I think it should - see below.

---

src/hotspot/share/ci/ciReplay.cpp

793         jint* dims = NEW_RESOURCE_ARRAY(jint, rank);

why should this be jint?

---

src/hotspot/share/classfile/altHashing.cpp

Okay this looks more consistent with jint.

---

src/hotspot/share/code/debugInfo.hpp

These changes seem wrong. We have:

ConstantLongValue(jlong value)
ConstantDoubleValue(jdouble value)

so we should have:

ConstantIntValue(jint value)

---

src/hotspot/share/code/relocInfo.cpp

Change seems unnecessary - int32_t is fine

---

src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp

I see a complete mix of int and jint in this class, so why make the one change you did ??

---

src/hotspot/share/jvmci/jvmciCompilerToVM.cpp

1700     tty->write((char*) start, MIN2(length, (jint)O_BUFLEN));

why did you need to add the jint cast? It's used without any cast on the next two lines:

1701     length -= O_BUFLEN;
1702     offset += O_BUFLEN;

??

---

src/hotspot/share/jvmci/jvmciRuntime.cpp

Looking around this code it seems very confused about types - eg the previous function is declared jboolean yet returns a jint on one path! It isn't clear to me if the return type is what should be changed or the parameter type? I would just leave this alone.
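For what it's worth, the two C++ issues that keep turning up in the comments above can be reproduced standalone roughly like this (invented names, not the HotSpot declarations):

  typedef long jint_like;   // the Windows jni_md.h spelling in question

  // 1) A literal 0 becomes ambiguous once one overload takes 'long' and
  //    another takes a pointer: both require a standard conversion.
  struct ScopeValueLike {};
  void make_constant(jint_like v)       {}
  void make_constant(ScopeValueLike* p) {}

  // 2) Template argument deduction needs both arguments to have the same type.
  template <typename T> T min2_like(T a, T b) { return a < b ? a : b; }

  void demo() {
    make_constant((jint_like)0);      // OK: exact match after the cast
    // make_constant(0);              // error: ambiguous (long vs. pointer)
    (void)min2_like(4L, (long)3);     // OK once both arguments are 'long'
    // (void)min2_like(4L, 3);        // error: T deduced as 'long' and 'int'
  }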
--- src/hotspot/share/opto/mulnode.cpp Okay TypeInt has jint parts, so the remaining int32_t declarations (A, B, C, D) should also be jint. --- src/hotspot/share/opto/parse3.cpp I agree with the changes you made, but then: 419 jint dim_con = find_int_con(length[j], -1); should also be changed. And obviously MultiArrayExpandLimit should be defined as int not intx! --- src/hotspot/share/opto/phaseX.cpp I can see that intcon(jint i) is consistent with longcon(jlong l), but the use of "i" in the code is more consistent with int than jint. --- src/hotspot/share/opto/type.cpp 1505 int TypeInt::hash(void) const { 1506 return java_add(java_add(_lo, _hi), java_add((jint)_widen, (jint)Type::Int)); 1507 } I can see that the (jint) casts you added make sense, but then the whole function should be returning jint not int. Ditto the other hash functions. --- src/hotspot/share/prims/jni.cpp I think vm_created should be a bool. In fact all the fields you changed are logically bools - do Atomics work for bool now? --- src/hotspot/share/prims/jvm.cpp is_attachable is the terminology used in the JDK code. --- src/hotspot/share/prims/jvmtiEnvBase.cpp src/hotspot/share/prims/jvmtiImpl.cpp Are you making parameters consistent with the fields they initialize? --- src/hotspot/share/prims/jvmtiTagMap.cpp There is a mix of int and jint for slot in this code. You fixed some, but this remains: 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong thread_tag, 2441 jlong tid, 2442 jint depth, 2443 jmethodID method, 2444 jlocation bci, 2445 jint slot, --- src/hotspot/share/runtime/perfData.cpp Callers pass both jint and int, so param type seems arbitrary. --- src/hotspot/share/runtime/perfMemory.cpp src/hotspot/share/runtime/perfMemory.hpp PerfMemory::_initialized should ideally be a bool - can OrderAccess handle that now? --- src/java.base/share/native/include/jvm.h Not clear why the jio functions are not also JNICALL ? --- src/java.base/unix/native/include/jni_md.h There is no need to special case ARM. The differences in the existing code were for LTO support and that is now irrelevant. --- src/java.base/unix/native/include/jvm_md.h I know you've just copied this across, but it seems wrong to me: 57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. This may 58 // cause problems if JVM and the rest of JDK are built on different 59 // Linux releases. Here we define JVM_MAXPATHLEN to be MAXPATHLEN + 1, 60 // so buffers declared in VM are always >= 4096. 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 It doesn't make sense to me to define an internal "max path length" that can _exceed_ the platform max! That aside there's no support for building different parts of the JDK on different platforms and then bringing them together. And in any case I would think the real problem would be building on a platform that uses 4096 and running on one that uses 4095! But that aside this is a Linux hack and should be guarded by ifdef LINUX. (I doubt BSD needs it, the bsd file is just a copy of the linux one - the JDK macosx version does the right thing). Solaris and AIX should stay as-is at MAXPATHLEN. 86 #define ASYNC_SIGNAL SIGJVM2 This only exists on Solaris so I think should be in #ifdef SOLARIS, to make that clear. --- src/java.base/windows/native/include/jvm_md.h Given the differences between the two versions either something has been broken or "extern C" declarations are not needed :) --- That was a really painful way to spend most of my Friday. TGIF! 
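Concretely, the jvm_md.h suggestion above would look something like this (a sketch of the suggestion only, not a tested patch):

  #include <sys/param.h>

  #ifdef LINUX
  // Linux headers have disagreed on 4095 vs 4096 over the years, so pad by
  // one to keep buffers declared in the VM at >= 4096.
  #define JVM_MAXPATHLEN (MAXPATHLEN + 1)
  #else
  // Solaris, AIX, macOS: just use the platform value.
  #define JVM_MAXPATHLEN MAXPATHLEN
  #endif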
:) Thanks, David ----- On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: > ?Hi Magnus, > > Thank you for reviewing this.?? I have a new version that takes out the > hack in globalDefinitions.hpp and adds casts to > src/hotspot/share/opto/type.cpp instead. > > Also some fixes from Martin at SAP. > > open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev > > see below. > > On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >> Coleen, >> >> Thank you for addressing this! >> >> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>> Summary: removed hotspot version of jvm*h and jni*h files >>> >>> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" after >>> precompiled.h, so if you have repetitive stress wrist issues don't >>> click on most of these files. >>> >>> There were more issues to resolve, however.? The JDK windows jni_md.h >>> file defined jint as long and the hotspot windows jni_x86.h as int. I >>> had to choose the jdk version since it's the public version, so there >>> are changes to the hotspot files for this. Generally I changed the >>> code to use 'int' rather than 'jint' where the surrounding API didn't >>> insist on consistently using java types. We should mostly be using >>> C++ types within hotspot except in interfaces to native/JNI code. >>> There are a couple of hacks in places where adding multiple jint >>> casts was too painful. >>> >>> Tested with JPRT and tier2-4 (in progress). >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >> >> Looks great! >> >> Just a few comments: >> >> * src/java.base/unix/native/include/jni_md.h: >> >> I don't think the externally_visible attribute should be there for >> arm. I know this was the case for the corresponding hotspot file for >> arm, but that was techically incorrect. The proper dependency here is >> that externally_visible should be in all JNIEXPORT if and only if >> we're building with JVM feature "link-time-opt". Traditionally, that >> feature been enabled when building arm32 builds, and only then, so >> there's been a (coincidentally) connection here. Nowadays, Oracle does >> not care about the arm32 builds, and I'm not sure if anyone else is >> building them with link-time-opt enabled. >> >> It does seem wrong to me to export this behavior in the public >> jni_md.h file, though. I think the correct way to solve this, if we >> should continue supporting link-time-opt is to make sure this >> attribute is set for exported hotspot functions. If it's still needed, >> that is. A quick googling seems to indicate that visibility("default") >> might be enough in modern gcc's. >> >> A third option is to remove the support for link-time-opt entirely, if >> it's not really used. > > I didn't know how to change this since we are still building ARM with > the jdk10/hs repository, and ARM needed this change.? I could wait until > we bring down the jdk10/master changes that remove the ARM build and > remove this conditional before I push.? Or we could file an RFE to > remove link-time-opt (?) and remove it then? > >> >> * src/java.base/unix/native/include/jvm_md.h and >> src/java.base/windows/native/include/jvm_md.h: >> >> These files define a public API, and contain non-trivial changes. I >> suspect you should file a CSR request. (Even though I realize you're >> only matching the header file with the reality.) >> > > I filed the CSR.?? Waiting for the next steps. 
> > Thanks, > Coleen > >> /Magnus >> >>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>> >>> I have a script to update copyright files on commit. >>> >>> Thanks to Magnus and ErikJ for the makefile changes. >>> >>> Thanks, >>> Coleen >>> >> > From aph at redhat.com Fri Oct 27 08:26:18 2017 From: aph at redhat.com (Andrew Haley) Date: Fri, 27 Oct 2017 09:26:18 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <59F2DC24.8050701@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <15dd917732444959b7785efbe6640952@sap.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> <59F2DC24.8050701@oracle.com> Message-ID: On 27/10/17 08:11, Erik ?sterlund wrote: > The current mechanism does not use mprotect to stop threads. Eh? Sure it does: you're talking about the new, proposed mechanism that's the subject of this patch, surely. > It has one global trapping page and one global not trapping page. It > simply performs stores to flip the polling word to point at the > trapping page. So I am not so concerned about TLB shootdown costs > here. As for the SEGV, the mechanism was stress tested (shooting > handshakes on all threads continuously) to see how expensive the > SEGV was, and the outcome was that it was surprisingly cheap. So we > did not pursue making the slow path faster. Interesting. It's a lot of code. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From erik.osterlund at oracle.com Fri Oct 27 08:36:42 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Fri, 27 Oct 2017 10:36:42 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> <59F2DC24.8050701@oracle.com> Message-ID: <59F2F01A.403@oracle.com> Hi Andrew, On 2017-10-27 10:26, Andrew Haley wrote: > On 27/10/17 08:11, Erik ?sterlund wrote: > >> The current mechanism does not use mprotect to stop threads. > Eh? Sure it does: you're talking about the new, proposed mechanism > that's the subject of this patch, surely. Yes indeed. Sorry, I was unclear. > >> It has one global trapping page and one global not trapping page. 
It >> simply performs stores to flip the polling word to point at the >> trapping page. So I am not so concerned about TLB shootdown costs >> here. As for the SEGV, the mechanism was stress tested (shooting >> handshakes on all threads continuously) to see how expensive the >> SEGV was, and the outcome was that it was surprisingly cheap. So we >> did not pursue making the slow path faster. > Interesting. It's a lot of code. :) Thanks, /Erik From serguei.spitsyn at oracle.com Fri Oct 27 08:52:54 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 27 Oct 2017 01:52:54 -0700 Subject: RFR: SA: JDK-8189798: SA cleanup - part 1 In-Reply-To: References: <18501902-23db-de6c-b83d-640cd33df836@oracle.com> Message-ID: <691d8166-5395-906a-4256-ef0ab2e2773a@oracle.com> Hi Jini, The fix looks good to me. Thanks, Serguei On 10/24/17 00:31, Jini George wrote: > Adding hotspot-dev too. > > Thanks, > Jini. > > On 10/24/2017 12:05 PM, Jini George wrote: >> Hello, >> >> As a part of SA next, I am working on writing a test case which >> compares the fields and the types of the fields of the SA java >> classes with the corresponding entries in the vmStructs tables. This, >> to some extent, would help in preventing errors in SA due to the >> changes in hotspot. As a precursor to this, I am in the process of >> making some cleanup related changes (mostly in SA). I plan to have >> the changes done in parts. For this webrev, most of the changes are for: >> >> 1. Avoiding having some values being redefined in SA. Instead have >> those exported through vmStructs, and read it in SA. >> (CompactibleFreeListSpace::_min_chunk_size_in_bytes, >> CompactibleFreeListSpace::IndexSetSize) >> >> Redefinition of hotspot values in SA makes SA error prone, when the >> value gets altered in hotspot and the corresponding modification gets >> missed out in SA. >> >> 2. To remove some unused code (JNIid.java). >> 3. Add the missing "CMSBitMap::_bmStartWord" in vmStructs. >> 4. Modify variable names in SA and hotspot to match the counterpart >> names, so that the comparison of the fields become easier. Most of >> the changes belong to this group. >> >> Could I please get reviews done for these precursor changes ? >> >> JBS Id: https://bugs.openjdk.java.net/browse/JDK-8189798 >> webrev: http://cr.openjdk.java.net/~jgeorge/8189798/webrev.00/ >> >> Thank you, >> Jini. >> From coleen.phillimore at oracle.com Fri Oct 27 11:12:06 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 27 Oct 2017 07:12:06 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <15a07ec6-3fc3-f757-1711-8d088d194115@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <8e157a28-5397-95c1-03dc-de6d0d3d37e8@oracle.com> <15a07ec6-3fc3-f757-1711-8d088d194115@oracle.com> Message-ID: <77e1ab82-4307-671f-1ca8-fd7f8a557b2c@oracle.com> Thank you for reviewing this, Mandy! Coleen On 10/26/17 5:27 PM, mandy chung wrote: > > > On 10/26/17 1:34 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 10/26/17 2:47 PM, mandy chung wrote: >>> >>> >>> On 10/26/17 2:57 AM, Magnus Ihse Bursie wrote: >>>> A third option is to remove the support for link-time-opt entirely, >>>> if it's not really used. 
>>>> >>>> * src/java.base/unix/native/include/jvm_md.h and >>>> src/java.base/windows/native/include/jvm_md.h: >>>> >>>> These files define a public API, and contain non-trivial changes. I >>>> suspect you should file a CSR request. (Even though I realize >>>> you're only matching the header file with the reality.) >>> >>> jvm.h and jvm_md.h are not public API and they are not copied to the >>> $JAVA_HOME/includes directly.? This does raise the question that >>> jvm*.h may belong to other location than >>> src/java.base/{share,$OS}/native/include. >> >> I'm not sure where else it would go honestly, but it could be moved >> outside this changeset.? The good thing about where it is, is that >> the -I directives in the makefiles find both jni.h and jvm.h. > > I agree we should keep this location for this change (the location is > a separate issue).? I reviewed the change that looks good to me. > > Mandy From magnus.ihse.bursie at oracle.com Fri Oct 27 11:44:53 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 27 Oct 2017 13:44:53 +0200 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> Message-ID: On 2017-10-26 22:44, coleen.phillimore at oracle.com wrote: > ?Hi Magnus, > > Thank you for reviewing this.?? I have a new version that takes out > the hack in globalDefinitions.hpp and adds casts to > src/hotspot/share/opto/type.cpp instead. > > Also some fixes from Martin at SAP. > > open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev > > see below. > > On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >> Coleen, >> >> Thank you for addressing this! >> >> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>> Summary: removed hotspot version of jvm*h and jni*h files >>> >>> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" >>> after precompiled.h, so if you have repetitive stress wrist issues >>> don't click on most of these files. >>> >>> There were more issues to resolve, however.? The JDK windows >>> jni_md.h file defined jint as long and the hotspot windows jni_x86.h >>> as int.? I had to choose the jdk version since it's the public >>> version, so there are changes to the hotspot files for this. >>> Generally I changed the code to use 'int' rather than 'jint' where >>> the surrounding API didn't insist on consistently using java types. >>> We should mostly be using C++ types within hotspot except in >>> interfaces to native/JNI code. There are a couple of hacks in places >>> where adding multiple jint casts was too painful. >>> >>> Tested with JPRT and tier2-4 (in progress). >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >> >> Looks great! >> >> Just a few comments: >> >> * src/java.base/unix/native/include/jni_md.h: >> >> I don't think the externally_visible attribute should be there for >> arm. I know this was the case for the corresponding hotspot file for >> arm, but that was techically incorrect. The proper dependency here is >> that externally_visible should be in all JNIEXPORT if and only if >> we're building with JVM feature "link-time-opt". Traditionally, that >> feature been enabled when building arm32 builds, and only then, so >> there's been a (coincidentally) connection here. 
Nowadays, Oracle >> does not care about the arm32 builds, and I'm not sure if anyone else >> is building them with link-time-opt enabled. >> >> It does seem wrong to me to export this behavior in the public >> jni_md.h file, though. I think the correct way to solve this, if we >> should continue supporting link-time-opt is to make sure this >> attribute is set for exported hotspot functions. If it's still >> needed, that is. A quick googling seems to indicate that >> visibility("default") might be enough in modern gcc's. >> >> A third option is to remove the support for link-time-opt entirely, >> if it's not really used. > > I didn't know how to change this since we are still building ARM with > the jdk10/hs repository, and ARM needed this change.? I could wait > until we bring down the jdk10/master changes that remove the ARM build > and remove this conditional before I push.? Or we could file an RFE to > remove link-time-opt (?) and remove it then? I'm looking into the link-time-opt issue right now. I think it boils down to us using an incorrect flag to gcc when linking, -fwhole-program, when -fuse-linker-plugin should have been used. This caused all exported symbols to disappear unless they were attributed with externally_visible, which makes sense for a program but not a shared library. I'm currently trying to verify that -fuse-linker-plugin removes the need for the externally_visible attribute when using link-time-opt. If it does, I'll open a separate bug to fix that, and if I push that first, you can safely delete the externally_visible attributes. /Magnus > >> >> * src/java.base/unix/native/include/jvm_md.h and >> src/java.base/windows/native/include/jvm_md.h: >> >> These files define a public API, and contain non-trivial changes. I >> suspect you should file a CSR request. (Even though I realize you're >> only matching the header file with the reality.) >> > > I filed the CSR.?? Waiting for the next steps. > > Thanks, > Coleen > >> /Magnus >> >>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>> >>> I have a script to update copyright files on commit. >>> >>> Thanks to Magnus and ErikJ for the makefile changes. >>> >>> Thanks, >>> Coleen >>> >> > From coleen.phillimore at oracle.com Fri Oct 27 12:13:12 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 27 Oct 2017 08:13:12 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> Message-ID: On 10/27/17 3:23 AM, David Holmes wrote: > Hi Coleen, > > Thanks for tackling this. > >> Summary: removed hotspot version of jvm*h and jni*h files > > Can you update the bug synopsis to show it covers both sets of files > please. > > I hate to start with this (and it took me quite a while to realize it) > but as Mandy pointed out jvm.h is not an exported interface from the > JDK to the outside world (so not subject to CSR review), but is a > private interface between the JVM and the JDK libraries. So I think > really jvm.h belongs in the hotspot sources where it was, while jni.h > belongs in the exported JDK sources. In which case the bulk of your > changes to the hotspot files would not be needed - sorry. 
Maybe someone can make that decision and change at a later date. The point of this change is that there is now only one of these files that is shared. I don't think jvm.h and the jvm_md.h belong in the hotspot sources for the jdk to find them in some random prims and os dependent directories.

I'm happy to withdraw the CSR. We generally use the CSR process to add and remove JVM_ interfaces even though they're a private interface, in case some other JVM/JDK combination relies on them. The changes to these files are very minor though and not likely to cause any even theoretical incompatibility, so I'll withdraw it.

>
> Moving on ...
>
> First to address the initial comments/query you had:
>
>> The JDK windows jni_md.h file defined jint as long and the hotspot
>> windows jni_x86.h as int. I had to choose the jdk version since it's the
>> public version, so there are changes to the hotspot files for this.
>
> On Windows int and long are always the same as it uses ILP32 or LLP64
> (not LP64 like *nix platforms). So either choice should be fine. That
> said there are some odd casting issues I comment on below. Does the VS
> compiler complain about mixing int and long in expressions?

Yes, it does even though int and long are the same representation.

>
>> Generally I changed the code to use 'int' rather than 'jint' where the
>> surrounding API didn't insist on consistently using java types. We
>> should mostly be using C++ types within hotspot except in interfaces to
>> native/JNI code.
>
> I think you pulled too hard on a few threads here and things are
> starting to unravel. There are numerous cases I refer to below where
> either the cast seems unnecessary/inappropriate or else highlights a
> bunch of additional changes that also need to be made. The fan out
> from this could be horrendous. Unless you actually get some kind of
> error - and I'd like to understand the details of those - I would not
> suggest making these changes as part of this work.

I didn't make any change unless there was an error. I have 100 failed JPRT jobs to confirm! I eventually got a Windows system to compile and test this on. Actually some of the changes came out better. Cases where we use jint as a bool simply turned to int. We do not have an overload for bool for cmpxchg.

>
> Looking through I have quite a few queries/comments - apologies in
> advance as I know how tedious this is:
>
> make/hotspot/lib/CompileLibjsig.gmk
> src/java.base/solaris/native/libjsig/jsig.c
>
> Took a while to figure out why the include was needed. :) As a follow
> up I suggest just deleting the -I include directive, delete the
> Solaris-only definition of JSIG_VERSION_1_4_1, and delete everything
> to do with JVM_get_libjsig_version. It is all obsolete.

Can I patch up jsig in a separate RFE? I don't remember why this broke so I simply moved the JSIG #define. Is jsig obsolete? Removing JVM_* definitions generally requires a CSR.

>
> ---
>
> src/hotspot/cpu/arm/interp_masm_arm.cpp
>
> Why did you need to add the jvm.h include?
>
  tbz(Raccess_flags, JVM_ACC_SYNCHRONIZED_BIT, unlocked);

---

>
> src/hotspot/os/windows/os_windows.cpp
>
> The type of process_exiting should be uint to match the DWORD of
> GetCurrentThreadId. Then you shouldn't need any casts. Also you missed
> this jint cast:
>
> 3796         process_exiting != (jint)GetCurrentThreadId()) {

Yes, that's better, to change process_exiting to a DWORD. It needs a DWORD cast to 0 in the cmpxchg.

Atomic::cmpxchg(GetCurrentThreadId(), &process_exiting, (DWORD)0); These templates are picky. > > --- > > src/hotspot/share/c1/c1_Canonicalizer.hpp > > ? 43 #ifdef _WINDOWS > ? 44?? // jint is defined as long in jni_md.h, so convert from int to > jint > ? 45?? void set_constant(int x)?????????????????????? { > set_constant((jint)x); } > ? 46 #endif > > Why is this necessary? int and long are the same on Windows. The whole > point is that jint hides the underlying type, so where does this go > wrong? No, they are not the same types even though they have the same representation! > > --- > > src/hotspot/share/c1/c1_LinearScan.cpp > > ?ConstantIntValue((jint)0); > > why is this cast needed? what causes the ambiguity? (If this was a > template I'd understand ;-) ). Also didn't you change that constructor > to take an int anyway - not that I think it should - see below. Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match 'long' better than any pointer type.? So this cast is needed. > > --- > > src/hotspot/share/ci/ciReplay.cpp > > 793???????? jint* dims = NEW_RESOURCE_ARRAY(jint, rank); > > why should this be jint? To avoid a cast from int* to jint* in the line below: value = kelem->multi_allocate(rank, dims, CHECK); > > --- > > src/hotspot/share/classfile/altHashing.cpp > > Okay this looks more consistent with jint. Yes.? I translated this from some native code iirc. > > --- > > src/hotspot/share/code/debugInfo.hpp > > These changes seem wrong. We have: > > ConstantLongValue(jlong value) > ConstantDoubleValue(jdouble value) > > so we should have: > > ConstantIntValue(jint value) Again, there are multiple call sites with '0', which match int trivially but are confused with long.? It's less consistent I agree but better to not cast all the call sites. > > --- > > src/hotspot/share/code/relocInfo.cpp > > Change seems unnecessary - int32_t is fine > No, int32_t doesn't match the calls below it.? They all assume _lo and _hi are jint. > --- > > src/hotspot/share/compiler/compileBroker.cpp > src/hotspot/share/compiler/compileBroker.hpp > > I see a complete mix of int and jint in this class, so why make the > one change you did ?? This is another case of using jint as a flag with cmpxchg.? The templates for cmpxchg want the types to match and 0 and 1 are essentially 'int'.? This is a lot cleaner this way. > > --- > > src/hotspot/share/jvmci/jvmciCompilerToVM.cpp > > 1700???? tty->write((char*) start, MIN2(length, (jint)O_BUFLEN)); > > why did you need to add the jint cast? It's used without any cast on > the next two lines: > > 1701???? length -= O_BUFLEN; > 1702???? offset += O_BUFLEN; > There's a conversion from O_BUFLEN from int to long in 1701 and 1702.?? MIN2 is a template that wants the types to match exactly. > ?? > > --- > > src/hotspot/share/jvmci/jvmciRuntime.cpp > > Looking around this code it seems very confused about types - eg the > previous function is declared jboolean yet returns a jint on one path! > It isn't clear to me if the return type is what should be changed or > the parameter type? I would just leave this alone. I can't leave it alone because it doesn't compile that way.? This was the minimal change and yea, does look a bit inconsistent. > > --- > > src/hotspot/share/opto/mulnode.cpp > > Okay TypeInt has jint parts, so the remaining int32_t declarations (A, > B, C, D) should also be jint. Yes.? c2 uses jint types. > > --- > > src/hotspot/share/opto/parse3.cpp > > I agree with the changes you made, but then: > > ?419???? 
jint dim_con = find_int_con(length[j], -1); > > should also be changed. > > And obviously MultiArrayExpandLimit should be defined as int not intx! Everything in globals.hpp is intx.? That's a thread that I don't want to pull on! Changed dim_con to int. > > --- > > src/hotspot/share/opto/phaseX.cpp > > I can see that intcon(jint i) is consistent with longcon(jlong l), but > the use of "i" in the code is more consistent with int than jint. huh?? really? > > --- > > src/hotspot/share/opto/type.cpp > > 1505 int TypeInt::hash(void) const { > 1506?? return java_add(java_add(_lo, _hi), java_add((jint)_widen, > (jint)Type::Int)); > 1507 } > > I can see that the (jint) casts you added make sense, but then the > whole function should be returning jint not int. Ditto the other hash > functions. I'm not messing with this, this is the minimal in type fixing that I'm going to do here. > > --- > > src/hotspot/share/prims/jni.cpp > > I think vm_created should be a bool. In fact all the fields you > changed are logically bools - do Atomics work for bool now? No, they do not.?? I had thought bool would be better originally too. > > --- > > src/hotspot/share/prims/jvm.cpp > > is_attachable is the terminology used in the JDK code. Well the JDK version had is_attach_supported() as the flag name so I used that in this one place. > > --- > > src/hotspot/share/prims/jvmtiEnvBase.cpp > src/hotspot/share/prims/jvmtiImpl.cpp > > Are you making parameters consistent with the fields they initialize? They're consistent with the declarations now. > > --- > > src/hotspot/share/prims/jvmtiTagMap.cpp > > There is a mix of int and jint for slot in this code. You fixed some, > but this remains: > > 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong thread_tag, > 2441??????????????????????????????????????????????????? jlong tid, > 2442??????????????????????????????????????????????????? jint depth, > 2443??????????????????????????????????????????????????? jmethodID method, > 2444??????????????????????????????????????????????????? jlocation bci, > 2445??????????????????????????????????????????????????? jint slot, Right for consistency with the declarations. > > --- > > src/hotspot/share/runtime/perfData.cpp > > Callers pass both jint and int, so param type seems arbitrary. They are, but importantly they match the declarations. > > --- > > src/hotspot/share/runtime/perfMemory.cpp > src/hotspot/share/runtime/perfMemory.hpp > > PerfMemory::_initialized should ideally be a bool - can OrderAccess > handle that now? Nope. > > --- > > src/java.base/share/native/include/jvm.h > > Not clear why the jio functions are not also JNICALL ? They are now.? The JDK version didn't have JNICALL.? JVM needs JNICALL.? I can't tell you why JDK didn't need JNICALL linkage. > > --- > > src/java.base/unix/native/include/jni_md.h > > There is no need to special case ARM. The differences in the existing > code were for LTO support and that is now irrelevant. See discussion with Magnus.?? We still build ARM for jdk10/hs so I needed this conditional or of course I wouldn't have added it.? We can remove it with LTO support. > > --- > > src/java.base/unix/native/include/jvm_md.h > > I know you've just copied this across, but it seems wrong to me: > > ?57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. This > may > ? 58 //?????? cause problems if JVM and the rest of JDK are built on > different > ? 59 //?????? Linux releases. Here we define JVM_MAXPATHLEN to be > MAXPATHLEN + 1, > ? 60 //?????? 
so buffers declared in VM are always >= 4096. > ? 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 > > It doesn't make sense to me to define an internal "max path length" > that can _exceed_ the platform max! > > That aside there's no support for building different parts of the JDK > on different platforms and then bringing them together. And in any > case I would think the real problem would be building on a platform > that uses 4096 and running on one that uses 4095! > > But that aside this is a Linux hack and should be guarded by ifdef > LINUX. (I doubt BSD needs it, the bsd file is just a copy of the linux > one - the JDK macosx version does the right thing). Solaris and AIX > should stay as-is at MAXPATHLEN. All of the unix platforms had MAXPATHLEN+1.? I'll leave it for now and we can investigate that further. > > ?86 #define ASYNC_SIGNAL???? SIGJVM2 > > This only exists on Solaris so I think should be in #ifdef SOLARIS, to > make that clear. Ok.? I'll add this. > > --- > > src/java.base/windows/native/include/jvm_md.h > > Given the differences between the two versions either something has > been broken or "extern C" declarations are not needed :) Well, they are needed for Hotspot to build and do not prevent jdk from building.? I don't know what was broken. > > --- > > That was a really painful way to spend most of my Friday. TGIF! :) Thanks for going through it.? See comments inline for changes. Generating a webrev takes hours so I'm not going to do that unless you insist. Thanks, Coleen > > Thanks, > David > ----- > > > On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: >> ??Hi Magnus, >> >> Thank you for reviewing this.?? I have a new version that takes out >> the hack in globalDefinitions.hpp and adds casts to >> src/hotspot/share/opto/type.cpp instead. >> >> Also some fixes from Martin at SAP. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >> >> see below. >> >> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>> Coleen, >>> >>> Thank you for addressing this! >>> >>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>> Summary: removed hotspot version of jvm*h and jni*h files >>>> >>>> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" >>>> after precompiled.h, so if you have repetitive stress wrist issues >>>> don't click on most of these files. >>>> >>>> There were more issues to resolve, however.? The JDK windows >>>> jni_md.h file defined jint as long and the hotspot windows >>>> jni_x86.h as int. I had to choose the jdk version since it's the >>>> public version, so there are changes to the hotspot files for this. >>>> Generally I changed the code to use 'int' rather than 'jint' where >>>> the surrounding API didn't insist on consistently using java types. >>>> We should mostly be using C++ types within hotspot except in >>>> interfaces to native/JNI code.? There are a couple of hacks in >>>> places where adding multiple jint casts was too painful. >>>> >>>> Tested with JPRT and tier2-4 (in progress). >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>> >>> Looks great! >>> >>> Just a few comments: >>> >>> * src/java.base/unix/native/include/jni_md.h: >>> >>> I don't think the externally_visible attribute should be there for >>> arm. I know this was the case for the corresponding hotspot file for >>> arm, but that was techically incorrect. The proper dependency here >>> is that externally_visible should be in all JNIEXPORT if and only if >>> we're building with JVM feature "link-time-opt". 
Traditionally, that >>> feature been enabled when building arm32 builds, and only then, so >>> there's been a (coincidentally) connection here. Nowadays, Oracle >>> does not care about the arm32 builds, and I'm not sure if anyone >>> else is building them with link-time-opt enabled. >>> >>> It does seem wrong to me to export this behavior in the public >>> jni_md.h file, though. I think the correct way to solve this, if we >>> should continue supporting link-time-opt is to make sure this >>> attribute is set for exported hotspot functions. If it's still >>> needed, that is. A quick googling seems to indicate that >>> visibility("default") might be enough in modern gcc's. >>> >>> A third option is to remove the support for link-time-opt entirely, >>> if it's not really used. >> >> I didn't know how to change this since we are still building ARM with >> the jdk10/hs repository, and ARM needed this change.? I could wait >> until we bring down the jdk10/master changes that remove the ARM >> build and remove this conditional before I push. Or we could file an >> RFE to remove link-time-opt (?) and remove it then? >> >>> >>> * src/java.base/unix/native/include/jvm_md.h and >>> src/java.base/windows/native/include/jvm_md.h: >>> >>> These files define a public API, and contain non-trivial changes. I >>> suspect you should file a CSR request. (Even though I realize you're >>> only matching the header file with the reality.) >>> >> >> I filed the CSR.?? Waiting for the next steps. >> >> Thanks, >> Coleen >> >>> /Magnus >>> >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>> >>>> I have a script to update copyright files on commit. >>>> >>>> Thanks to Magnus and ErikJ for the makefile changes. >>>> >>>> Thanks, >>>> Coleen >>>> >>> >> From coleen.phillimore at oracle.com Fri Oct 27 12:15:34 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 27 Oct 2017 08:15:34 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> Message-ID: <54c2f8c4-dcec-7b87-024f-65353c91242f@oracle.com> On 10/27/17 7:44 AM, Magnus Ihse Bursie wrote: > > On 2017-10-26 22:44, coleen.phillimore at oracle.com wrote: >> ?Hi Magnus, >> >> Thank you for reviewing this.?? I have a new version that takes out >> the hack in globalDefinitions.hpp and adds casts to >> src/hotspot/share/opto/type.cpp instead. >> >> Also some fixes from Martin at SAP. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >> >> see below. >> >> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>> Coleen, >>> >>> Thank you for addressing this! >>> >>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>> Summary: removed hotspot version of jvm*h and jni*h files >>>> >>>> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" >>>> after precompiled.h, so if you have repetitive stress wrist issues >>>> don't click on most of these files. >>>> >>>> There were more issues to resolve, however.? The JDK windows >>>> jni_md.h file defined jint as long and the hotspot windows >>>> jni_x86.h as int.? I had to choose the jdk version since it's the >>>> public version, so there are changes to the hotspot files for this. 
>>>> Generally I changed the code to use 'int' rather than 'jint' where >>>> the surrounding API didn't insist on consistently using java types. >>>> We should mostly be using C++ types within hotspot except in >>>> interfaces to native/JNI code. There are a couple of hacks in >>>> places where adding multiple jint casts was too painful. >>>> >>>> Tested with JPRT and tier2-4 (in progress). >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>> >>> Looks great! >>> >>> Just a few comments: >>> >>> * src/java.base/unix/native/include/jni_md.h: >>> >>> I don't think the externally_visible attribute should be there for >>> arm. I know this was the case for the corresponding hotspot file for >>> arm, but that was techically incorrect. The proper dependency here >>> is that externally_visible should be in all JNIEXPORT if and only if >>> we're building with JVM feature "link-time-opt". Traditionally, that >>> feature been enabled when building arm32 builds, and only then, so >>> there's been a (coincidentally) connection here. Nowadays, Oracle >>> does not care about the arm32 builds, and I'm not sure if anyone >>> else is building them with link-time-opt enabled. >>> >>> It does seem wrong to me to export this behavior in the public >>> jni_md.h file, though. I think the correct way to solve this, if we >>> should continue supporting link-time-opt is to make sure this >>> attribute is set for exported hotspot functions. If it's still >>> needed, that is. A quick googling seems to indicate that >>> visibility("default") might be enough in modern gcc's. >>> >>> A third option is to remove the support for link-time-opt entirely, >>> if it's not really used. >> >> I didn't know how to change this since we are still building ARM with >> the jdk10/hs repository, and ARM needed this change.? I could wait >> until we bring down the jdk10/master changes that remove the ARM >> build and remove this conditional before I push. Or we could file an >> RFE to remove link-time-opt (?) and remove it then? > > I'm looking into the link-time-opt issue right now. I think it boils > down to us using an incorrect flag to gcc when linking, > -fwhole-program, when -fuse-linker-plugin should have been used. This > caused all exported symbols to disappear unless they were attributed > with externally_visible, which makes sense for a program but not a > shared library. I'm currently trying to verify that > -fuse-linker-plugin removes the need for the externally_visible > attribute when using link-time-opt. If it does, I'll open a separate > bug to fix that, and if I push that first, you can safely delete the > externally_visible attributes. Thanks Magnus.? Let me know when you push this change and I'll update my change to remove this #ifdef ARM code.?? Please push to the hs repo though. Thanks! Coleen > > /Magnus > >> >>> >>> * src/java.base/unix/native/include/jvm_md.h and >>> src/java.base/windows/native/include/jvm_md.h: >>> >>> These files define a public API, and contain non-trivial changes. I >>> suspect you should file a CSR request. (Even though I realize you're >>> only matching the header file with the reality.) >>> >> >> I filed the CSR.?? Waiting for the next steps. >> >> Thanks, >> Coleen >> >>> /Magnus >>> >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>> >>>> I have a script to update copyright files on commit. >>>> >>>> Thanks to Magnus and ErikJ for the makefile changes. 
>>>> >>>> Thanks, >>>> Coleen >>>> >>> >> > From david.holmes at oracle.com Fri Oct 27 12:31:33 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 27 Oct 2017 22:31:33 +1000 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> Message-ID: Magnus, LTO is irrelevant now. David On 27/10/2017 9:44 PM, Magnus Ihse Bursie wrote: > > On 2017-10-26 22:44, coleen.phillimore at oracle.com wrote: >> ?Hi Magnus, >> >> Thank you for reviewing this.?? I have a new version that takes out >> the hack in globalDefinitions.hpp and adds casts to >> src/hotspot/share/opto/type.cpp instead. >> >> Also some fixes from Martin at SAP. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >> >> see below. >> >> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>> Coleen, >>> >>> Thank you for addressing this! >>> >>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>> Summary: removed hotspot version of jvm*h and jni*h files >>>> >>>> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" >>>> after precompiled.h, so if you have repetitive stress wrist issues >>>> don't click on most of these files. >>>> >>>> There were more issues to resolve, however.? The JDK windows >>>> jni_md.h file defined jint as long and the hotspot windows jni_x86.h >>>> as int.? I had to choose the jdk version since it's the public >>>> version, so there are changes to the hotspot files for this. >>>> Generally I changed the code to use 'int' rather than 'jint' where >>>> the surrounding API didn't insist on consistently using java types. >>>> We should mostly be using C++ types within hotspot except in >>>> interfaces to native/JNI code. There are a couple of hacks in places >>>> where adding multiple jint casts was too painful. >>>> >>>> Tested with JPRT and tier2-4 (in progress). >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>> >>> Looks great! >>> >>> Just a few comments: >>> >>> * src/java.base/unix/native/include/jni_md.h: >>> >>> I don't think the externally_visible attribute should be there for >>> arm. I know this was the case for the corresponding hotspot file for >>> arm, but that was techically incorrect. The proper dependency here is >>> that externally_visible should be in all JNIEXPORT if and only if >>> we're building with JVM feature "link-time-opt". Traditionally, that >>> feature been enabled when building arm32 builds, and only then, so >>> there's been a (coincidentally) connection here. Nowadays, Oracle >>> does not care about the arm32 builds, and I'm not sure if anyone else >>> is building them with link-time-opt enabled. >>> >>> It does seem wrong to me to export this behavior in the public >>> jni_md.h file, though. I think the correct way to solve this, if we >>> should continue supporting link-time-opt is to make sure this >>> attribute is set for exported hotspot functions. If it's still >>> needed, that is. A quick googling seems to indicate that >>> visibility("default") might be enough in modern gcc's. >>> >>> A third option is to remove the support for link-time-opt entirely, >>> if it's not really used. >> >> I didn't know how to change this since we are still building ARM with >> the jdk10/hs repository, and ARM needed this change.? 
I could wait >> until we bring down the jdk10/master changes that remove the ARM build >> and remove this conditional before I push.? Or we could file an RFE to >> remove link-time-opt (?) and remove it then? > > I'm looking into the link-time-opt issue right now. I think it boils > down to us using an incorrect flag to gcc when linking, -fwhole-program, > when -fuse-linker-plugin should have been used. This caused all exported > symbols to disappear unless they were attributed with > externally_visible, which makes sense for a program but not a shared > library. I'm currently trying to verify that -fuse-linker-plugin removes > the need for the externally_visible attribute when using link-time-opt. > If it does, I'll open a separate bug to fix that, and if I push that > first, you can safely delete the externally_visible attributes. > /Magnus > >> >>> >>> * src/java.base/unix/native/include/jvm_md.h and >>> src/java.base/windows/native/include/jvm_md.h: >>> >>> These files define a public API, and contain non-trivial changes. I >>> suspect you should file a CSR request. (Even though I realize you're >>> only matching the header file with the reality.) >>> >> >> I filed the CSR.?? Waiting for the next steps. >> >> Thanks, >> Coleen >> >>> /Magnus >>> >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>> >>>> I have a script to update copyright files on commit. >>>> >>>> Thanks to Magnus and ErikJ for the makefile changes. >>>> >>>> Thanks, >>>> Coleen >>>> >>> >> > From robbin.ehn at oracle.com Fri Oct 27 13:14:32 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 27 Oct 2017 15:14:32 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <59F2F01A.403@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> <59F2DC24.8050701@oracle.com> <59F2F01A.403@oracle.com> Message-ID: Hi all, Poll in switches: http://cr.openjdk.java.net/~rehn/8185640/v7/Interpreter-Poll-Switch-10/ Poll in return: http://cr.openjdk.java.net/~rehn/8185640/v7/Interpreter-Poll-Ret-11/ Please take an extra look at poll in return. Sanity tested, big test run still running (99% complete - OK). Performance regression for the added polls increased to total of -0.68% vs global poll. (was -0.44%) We are discussing the opt-out option, the newest suggestion is to make it diagnostic. Opinions? For anyone applying these patches, the number 9 patch changes the option from product. I have not sent that out. 
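
For readers following the handshake discussion, a minimal conceptual sketch may help; this is not the HotSpot implementation and every name below is made up. The point of the per-thread poll is that a single thread can be stopped (a handshake) without bringing every thread to a global safepoint, which is also why the extra interpreter polls above carry a small cost.

    // Conceptual sketch only; names and layout are invented for illustration.
    #include <cstdint>

    struct ThreadStub {
      volatile uintptr_t poll_word;   // hypothetical per-thread poll state
    };

    // Old scheme: every thread tests one shared flag, so arming it stops everyone.
    inline bool global_poll(const volatile bool* safepoint_requested) {
      return *safepoint_requested;
    }

    // Thread-local scheme: each thread tests its own word, which can be armed
    // for a global safepoint or for a handshake with just that one thread.
    inline bool thread_local_poll(const ThreadStub* self) {
      return self->poll_word != 0;
    }
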
Thanks, Robbin From aph at redhat.com Fri Oct 27 13:21:32 2017 From: aph at redhat.com (Andrew Haley) Date: Fri, 27 Oct 2017 14:21:32 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> <59F2DC24.8050701@oracle.com> <59F2F01A.403@oracle.com> Message-ID: <3d3474e5-2380-8209-cb95-3ca8cc4aa4ed@redhat.com> On 27/10/17 14:14, Robbin Ehn wrote: > We are discussing the opt-out option, the newest suggestion is to make it > diagnostic. Opinions? We're working on ultra-low-pause-time garbage collection, and it would be very useful to be able to safepoint the interpreter at any bytecode, not at jumps. It is a performance-related option rather than diagonstic. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From david.holmes at oracle.com Fri Oct 27 13:37:42 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 27 Oct 2017 23:37:42 +1000 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> Message-ID: <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> On 27/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: > > > On 10/27/17 3:23 AM, David Holmes wrote: >> Hi Coleen, >> >> Thanks for tackling this. >> >>> Summary: removed hotspot version of jvm*h and jni*h files >> >> Can you update the bug synopsis to show it covers both sets of files >> please. >> >> I hate to start with this (and it took me quite a while to realize it) >> but as Mandy pointed out jvm.h is not an exported interface from the >> JDK to the outside world (so not subject to CSR review), but is a >> private interface between the JVM and the JDK libraries. So I think >> really jvm.h belongs in the hotspot sources where it was, while jni.h >> belongs in the exported JDK sources. In which case the bulk of your >> changes to the hotspot files would not be needed - sorry. > > Maybe someone can make that decision and change at a later date. The > point of this change is that there is now only one of these files that > is shared.? I don't think jvm.h and the jvm_md.h belong on the hotspot > sources for the jdk to find them in some random prims and os dependent > directories. The one file that is needed is a hotspot file - jvm.h defines the interface that hotspot exports via jvm.cpp. If you leave jvm.h in hotspot/prims then a very large chunk of your boilerplate changes are not needed. The JDK code doesn't care what the name of the directory is - whatever it is just gets added as a -I directive (the JDK code will include "jvm.h" not "prims/jvm.h" the way hotspot sources do. 
This isn't something we want to change back or move again later. Whatever we do now we live with. > I'm happy to withdraw the CSR.? We generally use the CSR process to add > and remove JVM_ interfaces even though they're a private interface in > case some other JVM/JDK combination relies on them. The changes to these > files are very minor though and not likely to cause any even theoretical > incompatibility, so I'll withdraw it. >> >> Moving on ... >> >> First to address the initial comments/query you had: >> >>> The JDK windows jni_md.h file defined jint as long and the hotspot >>> windows jni_x86.h as int. I had to choose the jdk version since it's the >>> public version, so there are changes to the hotspot files for this. >> >> On Windows int and long are always the same as it uses ILP32 or LLP64 >> (not LP64 like *nix platforms). So either choice should be fine. That >> said there are some odd casting issues I comment on below. Does the VS >> compiler complain about mixing int and long in expressions? > > Yes, it does even though int and long are the same representation. And what an absolute mess that makes. :( >> >>> Generally I changed the code to use 'int' rather than 'jint' where the >>> surrounding API didn't insist on consistently using java types. We >>> should mostly be using C++ types within hotspot except in interfaces to >>> native/JNI code. >> >> I think you pulled too hard on a few threads here and things are >> starting to unravel. There are numerous cases I refer to below where >> either the cast seems unnecessary/inappropriate or else highlights a >> bunch of additional changes that also need to be made. The fan out >> from this could be horrendous. Unless you actually get some kind of >> error - and I'd like to understand the details of those - I would not >> suggest making these changes as part of this work. > > I didn't make any change unless there was was an error.? I have 100 > failed JPRT jobs to confirm!? I eventually got a Windows system to > compile and test this on.?? Actually some of the changes came out > better.? Cases where we use jint as a bool simply turned to int.? We do > not have an overload for bool for cmpxchg. That's unfortunate - ditto for OrderAccess. >> >> Looking through I have a quite a few queries/comments - apologies in >> advance as I know how tedious this is: >> >> make/hotspot/lib/CompileLibjsig.gmk >> src/java.base/solaris/native/libjsig/jsig.c >> >> Took a while to figure out why the include was needed. :) As a follow >> up I suggest just deleting the -I include directive, delete the >> Solaris-only definition of JSIG_VERSION_1_4_1, and delete everything >> to do with JVM_get_libjsig_version. It is all obsolete. > > Can I patch up jsig in a separate RFE?? I don't remember why this broke > so I simply moved JSIG #define.? Is jsig obsolete?? Removing JVM_* > definitions generally requires a CSR. I did say "As a follow up". jsig is not obsolete but the jsig versioning code, only used by Solaris, is. >> >> --- >> >> src/hotspot/cpu/arm/interp_masm_arm.cpp >> >> Why did you need to add the jvm.h include? >> > > ? tbz(Raccess_flags, JVM_ACC_SYNCHRONIZED_BIT, unlocked); Okay. I'm not going to try and figure out how this code found this before. >> --- >> >> src/hotspot/os/windows/os_windows.cpp. >> >> The type of process_exiting should be uint to match the DWORD of >> GetCurrentThreadID. Then you should need any casts. Also you missed >> this jint cast: >> >> 3796???????? 
process_exiting != (jint)GetCurrentThreadId()) { > > Yes, that's better to change process_exiting to a DWORD.? It needs a > DWORD cast to 0 in the cmpxchg. > > ??????? Atomic::cmpxchg(GetCurrentThreadId(), &process_exiting, (DWORD)0); > > These templates are picky. Yes - their inability to deal with literals is extremely frustrating. >> >> --- >> >> src/hotspot/share/c1/c1_Canonicalizer.hpp >> >> ? 43 #ifdef _WINDOWS >> ? 44?? // jint is defined as long in jni_md.h, so convert from int to >> jint >> ? 45?? void set_constant(int x)?????????????????????? { >> set_constant((jint)x); } >> ? 46 #endif >> >> Why is this necessary? int and long are the same on Windows. The whole >> point is that jint hides the underlying type, so where does this go >> wrong? > > No, they are not the same types even though they have the same > representation! This is truly unfortunate. >> >> --- >> >> src/hotspot/share/c1/c1_LinearScan.cpp >> >> ?ConstantIntValue((jint)0); >> >> why is this cast needed? what causes the ambiguity? (If this was a >> template I'd understand ;-) ). Also didn't you change that constructor >> to take an int anyway - not that I think it should - see below. > > Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match > 'long' better than any pointer type.? So this cast is needed. But you changed the constructor to take an int! class ConstantIntValue: public ScopeValue { private: - jint _value; + int _value; public: - ConstantIntValue(jint value) { _value = value; } + ConstantIntValue(int value) { _value = value; } >> >> --- >> >> src/hotspot/share/ci/ciReplay.cpp >> >> 793???????? jint* dims = NEW_RESOURCE_ARRAY(jint, rank); >> >> why should this be jint? > > To avoid a cast from int* to jint* in the line below: > > value = kelem->multi_allocate(rank, dims, CHECK); > > >> >> --- >> >> src/hotspot/share/classfile/altHashing.cpp >> >> Okay this looks more consistent with jint. > > Yes.? I translated this from some native code iirc. >> >> --- >> >> src/hotspot/share/code/debugInfo.hpp >> >> These changes seem wrong. We have: >> >> ConstantLongValue(jlong value) >> ConstantDoubleValue(jdouble value) >> >> so we should have: >> >> ConstantIntValue(jint value) > > Again, there are multiple call sites with '0', which match int trivially > but are confused with long.? It's less consistent I agree but better to > not cast all the call sites. This is really making a mess of the APIs - they should be a jint but we declare them int because of a 0 casting problem. Can't we just use 0L? >> >> --- >> >> src/hotspot/share/code/relocInfo.cpp >> >> Change seems unnecessary - int32_t is fine >> > > No, int32_t doesn't match the calls below it.? They all assume _lo and > _hi are jint. >> --- >> >> src/hotspot/share/compiler/compileBroker.cpp >> src/hotspot/share/compiler/compileBroker.hpp >> >> I see a complete mix of int and jint in this class, so why make the >> one change you did ?? > > This is another case of using jint as a flag with cmpxchg.? The > templates for cmpxchg want the types to match and 0 and 1 are > essentially 'int'.? This is a lot cleaner this way. >> >> --- >> >> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >> >> 1700???? tty->write((char*) start, MIN2(length, (jint)O_BUFLEN)); >> >> why did you need to add the jint cast? It's used without any cast on >> the next two lines: >> >> 1701???? length -= O_BUFLEN; >> 1702???? offset += O_BUFLEN; >> > > There's a conversion from O_BUFLEN from int to long in 1701 and 1702. > MIN2 is a template that wants the types to match exactly. 
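
As a standalone illustration of that last point (this is not the real HotSpot MIN2, just a template with the same shape): a single template parameter cannot be deduced from two arguments of different integer types, which is exactly why the call site needs the explicit (jint) cast.

    // Illustration only; the types and values are made up, only the mismatch matters.
    template <class T>
    T min2(T a, T b) { return a < b ? a : b; }

    int example(int length) {
      long limit = 8192;                      // pretend this is the wider operand
      // min2(length, limit);                 // error: T deduced as both int and long
      return (int)min2((long)length, limit);  // the cast makes deduction unambiguous
    }
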
$%^%$! templates! >> ?? >> >> --- >> >> src/hotspot/share/jvmci/jvmciRuntime.cpp >> >> Looking around this code it seems very confused about types - eg the >> previous function is declared jboolean yet returns a jint on one path! >> It isn't clear to me if the return type is what should be changed or >> the parameter type? I would just leave this alone. > > I can't leave it alone because it doesn't compile that way.? This was > the minimal change and yea, does look a bit inconsistent. >> >> --- >> >> src/hotspot/share/opto/mulnode.cpp >> >> Okay TypeInt has jint parts, so the remaining int32_t declarations (A, >> B, C, D) should also be jint. > > Yes.? c2 uses jint types. >> >> --- >> >> src/hotspot/share/opto/parse3.cpp >> >> I agree with the changes you made, but then: >> >> ?419???? jint dim_con = find_int_con(length[j], -1); >> >> should also be changed. >> >> And obviously MultiArrayExpandLimit should be defined as int not intx! > > Everything in globals.hpp is intx.? That's a thread that I don't want to > pull on! We still have that limitation? > > Changed dim_con to int. >> >> --- >> >> src/hotspot/share/opto/phaseX.cpp >> >> I can see that intcon(jint i) is consistent with longcon(jlong l), but >> the use of "i" in the code is more consistent with int than jint. > > huh?? really? >> >> --- >> >> src/hotspot/share/opto/type.cpp >> >> 1505 int TypeInt::hash(void) const { >> 1506?? return java_add(java_add(_lo, _hi), java_add((jint)_widen, >> (jint)Type::Int)); >> 1507 } >> >> I can see that the (jint) casts you added make sense, but then the >> whole function should be returning jint not int. Ditto the other hash >> functions. > > I'm not messing with this, this is the minimal in type fixing that I'm > going to do here. >> >> --- >> >> src/hotspot/share/prims/jni.cpp >> >> I think vm_created should be a bool. In fact all the fields you >> changed are logically bools - do Atomics work for bool now? > > No, they do not.?? I had thought bool would be better originally too. >> >> --- >> >> src/hotspot/share/prims/jvm.cpp >> >> is_attachable is the terminology used in the JDK code. > > Well the JDK version had is_attach_supported() as the flag name so I > used that in this one place. >> >> --- >> >> src/hotspot/share/prims/jvmtiEnvBase.cpp >> src/hotspot/share/prims/jvmtiImpl.cpp >> >> Are you making parameters consistent with the fields they initialize? > > They're consistent with the declarations now. >> >> --- >> >> src/hotspot/share/prims/jvmtiTagMap.cpp >> >> There is a mix of int and jint for slot in this code. You fixed some, >> but this remains: >> >> 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong thread_tag, >> 2441??????????????????????????????????????????????????? jlong tid, >> 2442??????????????????????????????????????????????????? jint depth, >> 2443??????????????????????????????????????????????????? jmethodID method, >> 2444??????????????????????????????????????????????????? jlocation bci, >> 2445??????????????????????????????????????????????????? jint slot, > > Right for consistency with the declarations. >> >> --- >> >> src/hotspot/share/runtime/perfData.cpp >> >> Callers pass both jint and int, so param type seems arbitrary. > > They are, but importantly they match the declarations. >> >> --- >> >> src/hotspot/share/runtime/perfMemory.cpp >> src/hotspot/share/runtime/perfMemory.hpp >> >> PerfMemory::_initialized should ideally be a bool - can OrderAccess >> handle that now? > > Nope. 
>> >> --- >> >> src/java.base/share/native/include/jvm.h >> >> Not clear why the jio functions are not also JNICALL ? > > They are now.? The JDK version didn't have JNICALL.? JVM needs JNICALL. > I can't tell you why JDK didn't need JNICALL linkage. ?? JVM currently does not have JNICALL. But they are declared as "extern C". >> >> --- >> >> src/java.base/unix/native/include/jni_md.h >> >> There is no need to special case ARM. The differences in the existing >> code were for LTO support and that is now irrelevant. > > See discussion with Magnus.?? We still build ARM for jdk10/hs so I > needed this conditional or of course I wouldn't have added it.? We can > remove it with LTO support. Those builds are gone - this is obsolete. But yes all LTO can be removed later if you wish. Just trying to simplify things now. >> >> --- >> >> src/java.base/unix/native/include/jvm_md.h >> >> I know you've just copied this across, but it seems wrong to me: >> >> ?57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. This >> may >> ? 58 //?????? cause problems if JVM and the rest of JDK are built on >> different >> ? 59 //?????? Linux releases. Here we define JVM_MAXPATHLEN to be >> MAXPATHLEN + 1, >> ? 60 //?????? so buffers declared in VM are always >= 4096. >> ? 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 >> >> It doesn't make sense to me to define an internal "max path length" >> that can _exceed_ the platform max! >> >> That aside there's no support for building different parts of the JDK >> on different platforms and then bringing them together. And in any >> case I would think the real problem would be building on a platform >> that uses 4096 and running on one that uses 4095! >> >> But that aside this is a Linux hack and should be guarded by ifdef >> LINUX. (I doubt BSD needs it, the bsd file is just a copy of the linux >> one - the JDK macosx version does the right thing). Solaris and AIX >> should stay as-is at MAXPATHLEN. > > All of the unix platforms had MAXPATHLEN+1.? I'll leave it for now and > we can investigate that further. I see the following existing code: src/java.base/unix/native/include/jvm_md.h: #define JVM_MAXPATHLEN MAXPATHLEN src/java.base/macosx/native/include/jvm_md.h #define JVM_MAXPATHLEN MAXPATHLEN src/hotspot/os/aix/jvm_aix.h #define JVM_MAXPATHLEN MAXPATHLEN src/hotspot/os/bsd/jvm_bsd.h #define JVM_MAXPATHLEN MAXPATHLEN + 1 // blindly copied from Linux version src/hotspot/os/linux/jvm_linux.h #define JVM_MAXPATHLEN MAXPATHLEN + 1 src/hotspot/os/solaris/jvm_solaris.h #define JVM_MAXPATHLEN MAXPATHLEN This is a linux only hack (if you ignore the blind copy from linux into the BSD code in the VM). >> >> ?86 #define ASYNC_SIGNAL???? SIGJVM2 >> >> This only exists on Solaris so I think should be in #ifdef SOLARIS, to >> make that clear. > > Ok.? I'll add this. >> >> --- >> >> src/java.base/windows/native/include/jvm_md.h >> >> Given the differences between the two versions either something has >> been broken or "extern C" declarations are not needed :) > > Well, they are needed for Hotspot to build and do not prevent jdk from > building.? I don't know what was broken. We really need to understand this better. Maybe related to the map files that expose the symbols. ?? >> >> --- >> >> That was a really painful way to spend most of my Friday. TGIF! :) > > Thanks for going through it.? See comments inline for changes. > Generating a webrev takes hours so I'm not going to do that unless you > insist. An incremental webrev shouldn't take long - right? You're a mq maestro now. 
:) If you can reasonably produce an incremental webrev once you've settled on all the comments/issues that would be good. Thanks, David > Thanks, > Coleen > > >> >> Thanks, >> David >> ----- >> >> >> On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: >>> ??Hi Magnus, >>> >>> Thank you for reviewing this.?? I have a new version that takes out >>> the hack in globalDefinitions.hpp and adds casts to >>> src/hotspot/share/opto/type.cpp instead. >>> >>> Also some fixes from Martin at SAP. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >>> >>> see below. >>> >>> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>>> Coleen, >>>> >>>> Thank you for addressing this! >>>> >>>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>> >>>>> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" >>>>> after precompiled.h, so if you have repetitive stress wrist issues >>>>> don't click on most of these files. >>>>> >>>>> There were more issues to resolve, however.? The JDK windows >>>>> jni_md.h file defined jint as long and the hotspot windows >>>>> jni_x86.h as int. I had to choose the jdk version since it's the >>>>> public version, so there are changes to the hotspot files for this. >>>>> Generally I changed the code to use 'int' rather than 'jint' where >>>>> the surrounding API didn't insist on consistently using java types. >>>>> We should mostly be using C++ types within hotspot except in >>>>> interfaces to native/JNI code.? There are a couple of hacks in >>>>> places where adding multiple jint casts was too painful. >>>>> >>>>> Tested with JPRT and tier2-4 (in progress). >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>>> >>>> Looks great! >>>> >>>> Just a few comments: >>>> >>>> * src/java.base/unix/native/include/jni_md.h: >>>> >>>> I don't think the externally_visible attribute should be there for >>>> arm. I know this was the case for the corresponding hotspot file for >>>> arm, but that was techically incorrect. The proper dependency here >>>> is that externally_visible should be in all JNIEXPORT if and only if >>>> we're building with JVM feature "link-time-opt". Traditionally, that >>>> feature been enabled when building arm32 builds, and only then, so >>>> there's been a (coincidentally) connection here. Nowadays, Oracle >>>> does not care about the arm32 builds, and I'm not sure if anyone >>>> else is building them with link-time-opt enabled. >>>> >>>> It does seem wrong to me to export this behavior in the public >>>> jni_md.h file, though. I think the correct way to solve this, if we >>>> should continue supporting link-time-opt is to make sure this >>>> attribute is set for exported hotspot functions. If it's still >>>> needed, that is. A quick googling seems to indicate that >>>> visibility("default") might be enough in modern gcc's. >>>> >>>> A third option is to remove the support for link-time-opt entirely, >>>> if it's not really used. >>> >>> I didn't know how to change this since we are still building ARM with >>> the jdk10/hs repository, and ARM needed this change.? I could wait >>> until we bring down the jdk10/master changes that remove the ARM >>> build and remove this conditional before I push. Or we could file an >>> RFE to remove link-time-opt (?) and remove it then? 
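
To make the attribute discussion above concrete, here is the rough shape such a conditional could take in a jni_md.h-style header. It is a sketch only, not the real file; LINK_TIME_OPT is a hypothetical macro standing in for however the build would signal the "link-time-opt" JVM feature, and whether externally_visible is still needed at all is exactly what is being verified with -fuse-linker-plugin.

    /* Sketch only -- not the actual jni_md.h. */
    #ifdef LINK_TIME_OPT
      /* With -flto -fwhole-program the link treats the image like a program and
         may drop symbols not reachable from main(), so exports are pinned here. */
      #define JNIEXPORT __attribute__((visibility("default"), externally_visible))
    #else
      /* With -flto -fuse-linker-plugin (or no LTO) default visibility suffices. */
      #define JNIEXPORT __attribute__((visibility("default")))
    #endif
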
>>> >>>> >>>> * src/java.base/unix/native/include/jvm_md.h and >>>> src/java.base/windows/native/include/jvm_md.h: >>>> >>>> These files define a public API, and contain non-trivial changes. I >>>> suspect you should file a CSR request. (Even though I realize you're >>>> only matching the header file with the reality.) >>>> >>> >>> I filed the CSR.?? Waiting for the next steps. >>> >>> Thanks, >>> Coleen >>> >>>> /Magnus >>>> >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>>> >>>>> I have a script to update copyright files on commit. >>>>> >>>>> Thanks to Magnus and ErikJ for the makefile changes. >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>> >>> > From coleen.phillimore at oracle.com Fri Oct 27 13:40:08 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 27 Oct 2017 09:40:08 -0400 Subject: RFR [S] JDK-8179624 [REDO] Avoid repeated calls to JavaThread::last_frame in InterpreterRuntime In-Reply-To: <842ce767-4436-02a3-f536-b71fed1fa6ed@oracle.com> References: <842ce767-4436-02a3-f536-b71fed1fa6ed@oracle.com> Message-ID: <5c7e239e-dd97-6bb7-2615-1cdb7c8b1844@oracle.com> This looks good. Thanks, Coleen On 10/26/17 5:53 PM, Ioi Lam wrote: > Hi, > > Please review the following change. It's a redo of a previous botched > attempt (JDK-8179305) that had a typo which caused JIT-related crashes. > > Thanks to Dean for spotting the typo. > > + Bug > https://bugs.openjdk.java.net/browse/JDK-8179624 > > > + The full changeset: > http://cr.openjdk.java.net/~iklam/jdk10/8179624-redo-8179305-avoid-last-frame.v01.full/ > > > > + The delta from the botched attempt > ? (fixing the typo with monitor_begin/monitor_end): > http://cr.openjdk.java.net/~iklam/jdk10/8179624-redo-8179305-avoid-last-frame.v01.redo_delta/ > > > > + Testing: > hotspot tier1~5 tests. > > > Thanks > - Ioi From coleen.phillimore at oracle.com Fri Oct 27 14:08:42 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 27 Oct 2017 10:08:42 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> Message-ID: <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> On 10/27/17 9:37 AM, David Holmes wrote: > On 27/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 10/27/17 3:23 AM, David Holmes wrote: >>> Hi Coleen, >>> >>> Thanks for tackling this. >>> >>>> Summary: removed hotspot version of jvm*h and jni*h files >>> >>> Can you update the bug synopsis to show it covers both sets of files >>> please. >>> >>> I hate to start with this (and it took me quite a while to realize >>> it) but as Mandy pointed out jvm.h is not an exported interface from >>> the JDK to the outside world (so not subject to CSR review), but is >>> a private interface between the JVM and the JDK libraries. So I >>> think really jvm.h belongs in the hotspot sources where it was, >>> while jni.h belongs in the exported JDK sources. In which case the >>> bulk of your changes to the hotspot files would not be needed - sorry. >> >> Maybe someone can make that decision and change at a later date. The >> point of this change is that there is now only one of these files >> that is shared.? 
I don't think jvm.h and the jvm_md.h belong on the >> hotspot sources for the jdk to find them in some random prims and os >> dependent directories. > > The one file that is needed is a hotspot file - jvm.h defines the > interface that hotspot exports via jvm.cpp. > > If you leave jvm.h in hotspot/prims then a very large chunk of your > boilerplate changes are not needed. The JDK code doesn't care what the > name of the directory is - whatever it is just gets added as a -I > directive (the JDK code will include "jvm.h" not "prims/jvm.h" the way > hotspot sources do. > > This isn't something we want to change back or move again later. > Whatever we do now we live with. I think it belongs with jni.h and I think the core libraries group would agree.?? It seems more natural there than buried in the hotspot prims directory.? I guess this is on hold while we have this debate.?? Sigh. Actually with -I directives, changing to jvm.h from prims/jvm.h would still work.?? Maybe we should change the name to jvm.hpp since it's jvm.cpp though??? Or maybe just have two divergent copies and close this as WNF. > >> I'm happy to withdraw the CSR.? We generally use the CSR process to >> add and remove JVM_ interfaces even though they're a private >> interface in case some other JVM/JDK combination relies on them. The >> changes to these files are very minor though and not likely to cause >> any even theoretical incompatibility, so I'll withdraw it. >>> >>> Moving on ... >>> >>> First to address the initial comments/query you had: >>> >>>> The JDK windows jni_md.h file defined jint as long and the hotspot >>>> windows jni_x86.h as int. I had to choose the jdk version since >>>> it's the >>>> public version, so there are changes to the hotspot files for this. >>> >>> On Windows int and long are always the same as it uses ILP32 or >>> LLP64 (not LP64 like *nix platforms). So either choice should be >>> fine. That said there are some odd casting issues I comment on >>> below. Does the VS compiler complain about mixing int and long in >>> expressions? >> >> Yes, it does even though int and long are the same representation. > > And what an absolute mess that makes. :( > >>> >>>> Generally I changed the code to use 'int' rather than 'jint' where the >>>> surrounding API didn't insist on consistently using java types. We >>>> should mostly be using C++ types within hotspot except in >>>> interfaces to >>>> native/JNI code. >>> >>> I think you pulled too hard on a few threads here and things are >>> starting to unravel. There are numerous cases I refer to below where >>> either the cast seems unnecessary/inappropriate or else highlights a >>> bunch of additional changes that also need to be made. The fan out >>> from this could be horrendous. Unless you actually get some kind of >>> error - and I'd like to understand the details of those - I would >>> not suggest making these changes as part of this work. >> >> I didn't make any change unless there was was an error.? I have 100 >> failed JPRT jobs to confirm!? I eventually got a Windows system to >> compile and test this on.?? Actually some of the changes came out >> better.? Cases where we use jint as a bool simply turned to int.? We >> do not have an overload for bool for cmpxchg. > > That's unfortunate - ditto for OrderAccess. 
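
A standalone sketch of the literal problem being lamented here; this is not the HotSpot Atomic API, only a template with the same deduction behaviour. The destination pointer fixes the template parameter, so a bare literal 0, whose type is int, conflicts with it and has to be cast, mirroring the (DWORD)0 above.

    typedef unsigned int DWORD_like;   // stands in for DWORD; the name is made up

    template <class T>
    T cmpxchg_like(T exchange_value, volatile T* dest, T compare_value) {
      T old = *dest;                                   // illustration only -- not atomic
      if (old == compare_value) *dest = exchange_value;
      return old;
    }

    volatile DWORD_like process_exiting_stub = 0;

    void example(DWORD_like tid) {
      // cmpxchg_like(tid, &process_exiting_stub, 0);           // error: T deduced as
      //                                                        // both DWORD_like and int
      cmpxchg_like(tid, &process_exiting_stub, (DWORD_like)0);  // cast resolves deduction
    }
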
> >>> >>> Looking through I have a quite a few queries/comments - apologies in >>> advance as I know how tedious this is: >>> >>> make/hotspot/lib/CompileLibjsig.gmk >>> src/java.base/solaris/native/libjsig/jsig.c >>> >>> Took a while to figure out why the include was needed. :) As a >>> follow up I suggest just deleting the -I include directive, delete >>> the Solaris-only definition of JSIG_VERSION_1_4_1, and delete >>> everything to do with JVM_get_libjsig_version. It is all obsolete. >> >> Can I patch up jsig in a separate RFE?? I don't remember why this >> broke so I simply moved JSIG #define.? Is jsig obsolete? Removing >> JVM_* definitions generally requires a CSR. > > I did say "As a follow up". jsig is not obsolete but the jsig > versioning code, only used by Solaris, is. > >>> >>> --- >>> >>> src/hotspot/cpu/arm/interp_masm_arm.cpp >>> >>> Why did you need to add the jvm.h include? >>> >> >> ?? tbz(Raccess_flags, JVM_ACC_SYNCHRONIZED_BIT, unlocked); > > Okay. I'm not going to try and figure out how this code found this > before. > >>> --- >>> >>> src/hotspot/os/windows/os_windows.cpp. >>> >>> The type of process_exiting should be uint to match the DWORD of >>> GetCurrentThreadID. Then you should need any casts. Also you missed >>> this jint cast: >>> >>> 3796???????? process_exiting != (jint)GetCurrentThreadId()) { >> >> Yes, that's better to change process_exiting to a DWORD.? It needs a >> DWORD cast to 0 in the cmpxchg. >> >> ???????? Atomic::cmpxchg(GetCurrentThreadId(), &process_exiting, >> (DWORD)0); >> >> These templates are picky. > > Yes - their inability to deal with literals is extremely frustrating. > >>> >>> --- >>> >>> src/hotspot/share/c1/c1_Canonicalizer.hpp >>> >>> ? 43 #ifdef _WINDOWS >>> ? 44?? // jint is defined as long in jni_md.h, so convert from int >>> to jint >>> ? 45?? void set_constant(int x)?????????????????????? { >>> set_constant((jint)x); } >>> ? 46 #endif >>> >>> Why is this necessary? int and long are the same on Windows. The >>> whole point is that jint hides the underlying type, so where does >>> this go wrong? >> >> No, they are not the same types even though they have the same >> representation! > > This is truly unfortunate. > >>> >>> --- >>> >>> src/hotspot/share/c1/c1_LinearScan.cpp >>> >>> ?ConstantIntValue((jint)0); >>> >>> why is this cast needed? what causes the ambiguity? (If this was a >>> template I'd understand ;-) ). Also didn't you change that >>> constructor to take an int anyway - not that I think it should - see >>> below. >> >> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match >> 'long' better than any pointer type.? So this cast is needed. > > But you changed the constructor to take an int! > > ?class ConstantIntValue: public ScopeValue { > ? private: > -? jint _value; > +? int _value; > ? public: > -? ConstantIntValue(jint value)???????? { _value = value; } > +? ConstantIntValue(int value)????????? { _value = value; } > > Okay I removed this cast. >>> --- >>> >>> src/hotspot/share/ci/ciReplay.cpp >>> >>> 793???????? jint* dims = NEW_RESOURCE_ARRAY(jint, rank); >>> >>> why should this be jint? >> >> To avoid a cast from int* to jint* in the line below: >> >> ????????? value = kelem->multi_allocate(rank, dims, CHECK); >> >> >>> >>> --- >>> >>> src/hotspot/share/classfile/altHashing.cpp >>> >>> Okay this looks more consistent with jint. >> >> Yes.? I translated this from some native code iirc. >>> >>> --- >>> >>> src/hotspot/share/code/debugInfo.hpp >>> >>> These changes seem wrong. 
We have: >>> >>> ConstantLongValue(jlong value) >>> ConstantDoubleValue(jdouble value) >>> >>> so we should have: >>> >>> ConstantIntValue(jint value) >> >> Again, there are multiple call sites with '0', which match int >> trivially but are confused with long.? It's less consistent I agree >> but better to not cast all the call sites. > > This is really making a mess of the APIs - they should be a jint but > we declare them int because of a 0 casting problem. Can't we just use 0L? There aren't that many casts.? You're right, that would have been better in some places. >>> >>> --- >>> >>> src/hotspot/share/code/relocInfo.cpp >>> >>> Change seems unnecessary - int32_t is fine >>> >> >> No, int32_t doesn't match the calls below it.? They all assume _lo >> and _hi are jint. >>> --- >>> >>> src/hotspot/share/compiler/compileBroker.cpp >>> src/hotspot/share/compiler/compileBroker.hpp >>> >>> I see a complete mix of int and jint in this class, so why make the >>> one change you did ?? >> >> This is another case of using jint as a flag with cmpxchg.? The >> templates for cmpxchg want the types to match and 0 and 1 are >> essentially 'int'.? This is a lot cleaner this way. > > > >>> >>> --- >>> >>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >>> >>> 1700???? tty->write((char*) start, MIN2(length, (jint)O_BUFLEN)); >>> >>> why did you need to add the jint cast? It's used without any cast on >>> the next two lines: >>> >>> 1701???? length -= O_BUFLEN; >>> 1702???? offset += O_BUFLEN; >>> >> >> There's a conversion from O_BUFLEN from int to long in 1701 and >> 1702.?? MIN2 is a template that wants the types to match exactly. > > $%^%$! templates! > >>> ?? >>> >>> --- >>> >>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>> >>> Looking around this code it seems very confused about types - eg the >>> previous function is declared jboolean yet returns a jint on one >>> path! It isn't clear to me if the return type is what should be >>> changed or the parameter type? I would just leave this alone. >> >> I can't leave it alone because it doesn't compile that way. This was >> the minimal change and yea, does look a bit inconsistent. >>> >>> --- >>> >>> src/hotspot/share/opto/mulnode.cpp >>> >>> Okay TypeInt has jint parts, so the remaining int32_t declarations >>> (A, B, C, D) should also be jint. >> >> Yes.? c2 uses jint types. >>> >>> --- >>> >>> src/hotspot/share/opto/parse3.cpp >>> >>> I agree with the changes you made, but then: >>> >>> ?419???? jint dim_con = find_int_con(length[j], -1); >>> >>> should also be changed. >>> >>> And obviously MultiArrayExpandLimit should be defined as int not intx! >> >> Everything in globals.hpp is intx.? That's a thread that I don't want >> to pull on! > > We still have that limitation? >> >> Changed dim_con to int. >>> >>> --- >>> >>> src/hotspot/share/opto/phaseX.cpp >>> >>> I can see that intcon(jint i) is consistent with longcon(jlong l), >>> but the use of "i" in the code is more consistent with int than jint. >> >> huh?? really? >>> >>> --- >>> >>> src/hotspot/share/opto/type.cpp >>> >>> 1505 int TypeInt::hash(void) const { >>> 1506?? return java_add(java_add(_lo, _hi), java_add((jint)_widen, >>> (jint)Type::Int)); >>> 1507 } >>> >>> I can see that the (jint) casts you added make sense, but then the >>> whole function should be returning jint not int. Ditto the other >>> hash functions. >> >> I'm not messing with this, this is the minimal in type fixing that >> I'm going to do here. 
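
The (jint)0 casts and the 0L question in this exchange come down to one overload-resolution rule, illustrated standalone below (not HotSpot code): a bare 0 converts equally well to long and to a pointer, while a bare 0L converts equally well to int and to a pointer, so whichever width jint has on a given platform one of the two literals ends up ambiguous, and the explicit (jint) cast is the portable spelling.

    struct ScopeValueStub {};                  // stands in for a competing pointer overload

    void make_value(long) {}                   // shape of the ctor on a platform where jint is long
    void make_value(ScopeValueStub*) {}

    void example() {
      // make_value(0);      // ambiguous: int -> long vs null-pointer conversion
      make_value((long)0);   // exact match -- the role the (jint) cast plays
    }
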
> > > >>> >>> --- >>> >>> src/hotspot/share/prims/jni.cpp >>> >>> I think vm_created should be a bool. In fact all the fields you >>> changed are logically bools - do Atomics work for bool now? >> >> No, they do not.?? I had thought bool would be better originally too. >>> >>> --- >>> >>> src/hotspot/share/prims/jvm.cpp >>> >>> is_attachable is the terminology used in the JDK code. >> >> Well the JDK version had is_attach_supported() as the flag name so I >> used that in this one place. >>> >>> --- >>> >>> src/hotspot/share/prims/jvmtiEnvBase.cpp >>> src/hotspot/share/prims/jvmtiImpl.cpp >>> >>> Are you making parameters consistent with the fields they initialize? >> >> They're consistent with the declarations now. >>> >>> --- >>> >>> src/hotspot/share/prims/jvmtiTagMap.cpp >>> >>> There is a mix of int and jint for slot in this code. You fixed >>> some, but this remains: >>> >>> 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong >>> thread_tag, >>> 2441??????????????????????????????????????????????????? jlong tid, >>> 2442??????????????????????????????????????????????????? jint depth, >>> 2443 jmethodID method, >>> 2444 jlocation bci, >>> 2445??????????????????????????????????????????????????? jint slot, >> >> Right for consistency with the declarations. >>> >>> --- >>> >>> src/hotspot/share/runtime/perfData.cpp >>> >>> Callers pass both jint and int, so param type seems arbitrary. >> >> They are, but importantly they match the declarations. >>> >>> --- >>> >>> src/hotspot/share/runtime/perfMemory.cpp >>> src/hotspot/share/runtime/perfMemory.hpp >>> >>> PerfMemory::_initialized should ideally be a bool - can OrderAccess >>> handle that now? >> >> Nope. >>> >>> --- >>> >>> src/java.base/share/native/include/jvm.h >>> >>> Not clear why the jio functions are not also JNICALL ? >> >> They are now.? The JDK version didn't have JNICALL.? JVM needs >> JNICALL.? I can't tell you why JDK didn't need JNICALL linkage. > > ?? JVM currently does not have JNICALL. But they are declared as > "extern C". This was a compilation error on Windows with JDK.?? Maybe the C code in the JDK doesn't complain about linkage differences.? I'll have to go back and figure this out then. > >>> >>> --- >>> >>> src/java.base/unix/native/include/jni_md.h >>> >>> There is no need to special case ARM. The differences in the >>> existing code were for LTO support and that is now irrelevant. >> >> See discussion with Magnus.?? We still build ARM for jdk10/hs so I >> needed this conditional or of course I wouldn't have added it.? We >> can remove it with LTO support. > > Those builds are gone - this is obsolete. But yes all LTO can be > removed later if you wish. Just trying to simplify things now. > >>> >>> --- >>> >>> src/java.base/unix/native/include/jvm_md.h >>> >>> I know you've just copied this across, but it seems wrong to me: >>> >>> ?57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. >>> This may >>> ? 58 //?????? cause problems if JVM and the rest of JDK are built on >>> different >>> ? 59 //?????? Linux releases. Here we define JVM_MAXPATHLEN to be >>> MAXPATHLEN + 1, >>> ? 60 //?????? so buffers declared in VM are always >= 4096. >>> ? 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>> >>> It doesn't make sense to me to define an internal "max path length" >>> that can _exceed_ the platform max! >>> >>> That aside there's no support for building different parts of the >>> JDK on different platforms and then bringing them together. 
And in >>> any case I would think the real problem would be building on a >>> platform that uses 4096 and running on one that uses 4095! >>> >>> But that aside this is a Linux hack and should be guarded by ifdef >>> LINUX. (I doubt BSD needs it, the bsd file is just a copy of the >>> linux one - the JDK macosx version does the right thing). Solaris >>> and AIX should stay as-is at MAXPATHLEN. >> >> All of the unix platforms had MAXPATHLEN+1.? I'll leave it for now >> and we can investigate that further. > > I see the following existing code: > > src/java.base/unix/native/include/jvm_md.h: > > #define JVM_MAXPATHLEN MAXPATHLEN > > src/java.base/macosx/native/include/jvm_md.h > > #define JVM_MAXPATHLEN MAXPATHLEN > > src/hotspot/os/aix/jvm_aix.h > > #define JVM_MAXPATHLEN MAXPATHLEN > > src/hotspot/os/bsd/jvm_bsd.h > > #define JVM_MAXPATHLEN MAXPATHLEN + 1? // blindly copied from Linux > version > > src/hotspot/os/linux/jvm_linux.h > > #define JVM_MAXPATHLEN MAXPATHLEN + 1 > > src/hotspot/os/solaris/jvm_solaris.h > > #define JVM_MAXPATHLEN MAXPATHLEN > > This is a linux only hack (if you ignore the blind copy from linux > into the BSD code in the VM). Oh, thanks, so should I add a bunch of ifdefs then?? Or do you think having MAXPATHLEN + 1 will really break the other platforms?? Do you really see this as a problem or are you just pointing out inconsistency? > >>> >>> ?86 #define ASYNC_SIGNAL???? SIGJVM2 >>> >>> This only exists on Solaris so I think should be in #ifdef SOLARIS, >>> to make that clear. >> >> Ok.? I'll add this. >>> >>> --- >>> >>> src/java.base/windows/native/include/jvm_md.h >>> >>> Given the differences between the two versions either something has >>> been broken or "extern C" declarations are not needed :) >> >> Well, they are needed for Hotspot to build and do not prevent jdk >> from building.? I don't know what was broken. > > We really need to understand this better. Maybe related to the map > files that expose the symbols. ?? They're needed because the JDK files are written mostly in C and that doesn't complain about the linkage difference.? Hotspot files are in C++ which does complain. > >>> >>> --- >>> >>> That was a really painful way to spend most of my Friday. TGIF! :) >> >> Thanks for going through it.? See comments inline for changes. >> Generating a webrev takes hours so I'm not going to do that unless >> you insist. > > An incremental webrev shouldn't take long - right? You're a mq maestro > now. :) Well I generally trash a repository whenever I use mq but sure. > > If you can reasonably produce an incremental webrev once you've > settled on all the comments/issues that would be good. Ok, sure. Coleen > > Thanks, > David > >> Thanks, >> Coleen >> >> >>> >>> Thanks, >>> David >>> ----- >>> >>> >>> On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: >>>> ??Hi Magnus, >>>> >>>> Thank you for reviewing this.?? I have a new version that takes out >>>> the hack in globalDefinitions.hpp and adds casts to >>>> src/hotspot/share/opto/type.cpp instead. >>>> >>>> Also some fixes from Martin at SAP. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >>>> >>>> see below. >>>> >>>> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>>>> Coleen, >>>>> >>>>> Thank you for addressing this! 
>>>>> >>>>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>> >>>>>> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" >>>>>> after precompiled.h, so if you have repetitive stress wrist >>>>>> issues don't click on most of these files. >>>>>> >>>>>> There were more issues to resolve, however.? The JDK windows >>>>>> jni_md.h file defined jint as long and the hotspot windows >>>>>> jni_x86.h as int. I had to choose the jdk version since it's the >>>>>> public version, so there are changes to the hotspot files for >>>>>> this. Generally I changed the code to use 'int' rather than >>>>>> 'jint' where the surrounding API didn't insist on consistently >>>>>> using java types. We should mostly be using C++ types within >>>>>> hotspot except in interfaces to native/JNI code.? There are a >>>>>> couple of hacks in places where adding multiple jint casts was >>>>>> too painful. >>>>>> >>>>>> Tested with JPRT and tier2-4 (in progress). >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>>>> >>>>> Looks great! >>>>> >>>>> Just a few comments: >>>>> >>>>> * src/java.base/unix/native/include/jni_md.h: >>>>> >>>>> I don't think the externally_visible attribute should be there for >>>>> arm. I know this was the case for the corresponding hotspot file >>>>> for arm, but that was techically incorrect. The proper dependency >>>>> here is that externally_visible should be in all JNIEXPORT if and >>>>> only if we're building with JVM feature "link-time-opt". >>>>> Traditionally, that feature been enabled when building arm32 >>>>> builds, and only then, so there's been a (coincidentally) >>>>> connection here. Nowadays, Oracle does not care about the arm32 >>>>> builds, and I'm not sure if anyone else is building them with >>>>> link-time-opt enabled. >>>>> >>>>> It does seem wrong to me to export this behavior in the public >>>>> jni_md.h file, though. I think the correct way to solve this, if >>>>> we should continue supporting link-time-opt is to make sure this >>>>> attribute is set for exported hotspot functions. If it's still >>>>> needed, that is. A quick googling seems to indicate that >>>>> visibility("default") might be enough in modern gcc's. >>>>> >>>>> A third option is to remove the support for link-time-opt >>>>> entirely, if it's not really used. >>>> >>>> I didn't know how to change this since we are still building ARM >>>> with the jdk10/hs repository, and ARM needed this change.? I could >>>> wait until we bring down the jdk10/master changes that remove the >>>> ARM build and remove this conditional before I push. Or we could >>>> file an RFE to remove link-time-opt (?) and remove it then? >>>> >>>>> >>>>> * src/java.base/unix/native/include/jvm_md.h and >>>>> src/java.base/windows/native/include/jvm_md.h: >>>>> >>>>> These files define a public API, and contain non-trivial changes. >>>>> I suspect you should file a CSR request. (Even though I realize >>>>> you're only matching the header file with the reality.) >>>>> >>>> >>>> I filed the CSR.?? Waiting for the next steps. >>>> >>>> Thanks, >>>> Coleen >>>> >>>>> /Magnus >>>>> >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>>>> >>>>>> I have a script to update copyright files on commit. >>>>>> >>>>>> Thanks to Magnus and ErikJ for the makefile changes. 
>>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >>>>> >>>> >> From robbin.ehn at oracle.com Fri Oct 27 14:45:36 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 27 Oct 2017 16:45:36 +0200 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <3d3474e5-2380-8209-cb95-3ca8cc4aa4ed@redhat.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> <59F2DC24.8050701@oracle.com> <59F2F01A.403@oracle.com> <3d3474e5-2380-8209-cb95-3ca8cc4aa4ed@redhat.com> Message-ID: On 2017-10-27 15:21, Andrew Haley wrote: > On 27/10/17 14:14, Robbin Ehn wrote: >> We are discussing the opt-out option, the newest suggestion is to make it >> diagnostic. Opinions? > > We're working on ultra-low-pause-time garbage collection, and it would be very > useful to be able to safepoint the interpreter at any bytecode, not at jumps. > It is a performance-related option rather than diagonstic. > For that I suggest the e.g UseShenandoah to set a VM internal global setting to low latency. Not exposing yet another option to the user. And in dispatch_base look for that, e.g: if (SafepointMechanism::uses_thread_local_poll() && table != safepoint_table && (generate_poll || SOME_GLOBAL_SETTING_FOR_LOW_LATENCY)) { When I get this into jdk10/hs, down-stream it to Shenandoah repo, do the benchmarks, upstream to jdk10/hs (ZGC might want this also). /Robbin From martin.doerr at sap.com Fri Oct 27 14:47:13 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 27 Oct 2017 14:47:13 +0000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <18f2001cbbbd4772aa9268e6e34b4be9@sap.com> <770b1286-3c8e-92e5-3929-17eb4e6c3847@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> <59F2DC24.8050701@oracle.com> <59F2F01A.403@oracle.com> Message-ID: <4ebb905f23324a00b9cf10d8d410d420@sap.com> Hi Robbin, excellent. I think this matches what Coleen had proposed, now. Thanks for doing all the work with so many incremental patches and for responding on so many discussions. Seems to be a tough piece of work. Best regards, Martin -----Original Message----- From: Robbin Ehn [mailto:robbin.ehn at oracle.com] Sent: Freitag, 27. 
Oktober 2017 15:15 To: Erik ?sterlund ; Andrew Haley ; Doerr, Martin ; Karen Kinnear ; Coleen Phillimore (coleen.phillimore at oracle.com) Cc: hotspot-dev developers Subject: Re: RFR(XL): 8185640: Thread-local handshakes Hi all, Poll in switches: http://cr.openjdk.java.net/~rehn/8185640/v7/Interpreter-Poll-Switch-10/ Poll in return: http://cr.openjdk.java.net/~rehn/8185640/v7/Interpreter-Poll-Ret-11/ Please take an extra look at poll in return. Sanity tested, big test run still running (99% complete - OK). Performance regression for the added polls increased to total of -0.68% vs global poll. (was -0.44%) We are discussing the opt-out option, the newest suggestion is to make it diagnostic. Opinions? For anyone applying these patches, the number 9 patch changes the option from product. I have not sent that out. Thanks, Robbin From coleen.phillimore at oracle.com Fri Oct 27 15:13:57 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 27 Oct 2017 11:13:57 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> Message-ID: <57390ec3-8d8d-a3d7-9774-b5945a323be9@oracle.com> On 10/27/17 9:37 AM, David Holmes wrote: >>> src/hotspot/share/c1/c1_LinearScan.cpp >>> >>> ?ConstantIntValue((jint)0); >>> >>> why is this cast needed? what causes the ambiguity? (If this was a >>> template I'd understand ;-) ). Also didn't you change that >>> constructor to take an int anyway - not that I think it should - see >>> below. >> >> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match >> 'long' better than any pointer type.? So this cast is needed. > > But you changed the constructor to take an int! > > ?class ConstantIntValue: public ScopeValue { > ? private: > -? jint _value; > +? int _value; > ? public: > -? ConstantIntValue(jint value)???????? { _value = value; } > +? ConstantIntValue(int value)????????? { _value = value; } > I changed this back to not take an int and changed c1_LinearScan.cpp to have the (jint)0 cast and output.cp needed (jint)0 casts.? 0L doesn't work for platforms where jint is an 'int' rather than a long because it's ambiguous with the functions that take a pointer type. Probably better to keep the type of ConstantIntValue consistent with j types. Thanks, Coleen From jini.george at oracle.com Fri Oct 27 15:49:45 2017 From: jini.george at oracle.com (Jini George) Date: Fri, 27 Oct 2017 21:19:45 +0530 Subject: RFR: SA: JDK-8189798: SA cleanup - part 1 In-Reply-To: <691d8166-5395-906a-4256-ef0ab2e2773a@oracle.com> References: <18501902-23db-de6c-b83d-640cd33df836@oracle.com> <691d8166-5395-906a-4256-ef0ab2e2773a@oracle.com> Message-ID: Thank you very much, Serguei. -Jini. On 10/27/2017 2:22 PM, serguei.spitsyn at oracle.com wrote: > Hi Jini, > > The fix looks good to me. > > Thanks, > Serguei > > > On 10/24/17 00:31, Jini George wrote: >> Adding hotspot-dev too. >> >> Thanks, >> Jini. >> >> On 10/24/2017 12:05 PM, Jini George wrote: >>> Hello, >>> >>> As a part of SA next, I am working on writing a test case which >>> compares the fields and the types of the fields of the SA java >>> classes with the corresponding entries in the vmStructs tables. 
This, >>> to some extent, would help in preventing errors in SA due to the >>> changes in hotspot. As a precursor to this, I am in the process of >>> making some cleanup related changes (mostly in SA). I plan to have >>> the changes done in parts. For this webrev, most of the changes are for: >>> >>> 1. Avoiding having some values being redefined in SA. Instead have >>> those exported through vmStructs, and read it in SA. >>> (CompactibleFreeListSpace::_min_chunk_size_in_bytes, >>> CompactibleFreeListSpace::IndexSetSize) >>> >>> Redefinition of hotspot values in SA makes SA error prone, when the >>> value gets altered in hotspot and the corresponding modification gets >>> missed out in SA. >>> >>> 2. To remove some unused code (JNIid.java). >>> 3. Add the missing "CMSBitMap::_bmStartWord" in vmStructs. >>> 4. Modify variable names in SA and hotspot to match the counterpart >>> names, so that the comparison of the fields become easier. Most of >>> the changes belong to this group. >>> >>> Could I please get reviews done for these precursor changes ? >>> >>> JBS Id: https://bugs.openjdk.java.net/browse/JDK-8189798 >>> webrev: http://cr.openjdk.java.net/~jgeorge/8189798/webrev.00/ >>> >>> Thank you, >>> Jini. >>> > From mandy.chung at oracle.com Fri Oct 27 17:47:21 2017 From: mandy.chung at oracle.com (mandy chung) Date: Fri, 27 Oct 2017 10:47:21 -0700 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> Message-ID: <51f09db9-06f5-ad01-bc92-1d73e1113f86@oracle.com> On 10/27/17 7:08 AM, coleen.phillimore at oracle.com wrote: > > > On 10/27/17 9:37 AM, David Holmes wrote: >> >> The one file that is needed is a hotspot file - jvm.h defines the >> interface that hotspot exports via jvm.cpp. >> >> If you leave jvm.h in hotspot/prims then a very large chunk of your >> boilerplate changes are not needed. The JDK code doesn't care what >> the name of the directory is - whatever it is just gets added as a -I >> directive (the JDK code will include "jvm.h" not "prims/jvm.h" the >> way hotspot sources do. >> >> This isn't something we want to change back or move again later. >> Whatever we do now we live with. > > I think it belongs with jni.h and I think the core libraries group > would agree.?? It seems more natural there than buried in the hotspot > prims directory.? I guess this is on hold while we have this debate.?? > Sigh. > > Actually with -I directives, changing to jvm.h from prims/jvm.h would > still work.?? Maybe we should change the name to jvm.hpp since it's > jvm.cpp though??? Or maybe just have two divergent copies and close > this as WNF. I also think hotspot/prims is not a good location. src/java.base/share/include is a well-defined location for native header files.? Maybe internal header files could be placed in include/internal but this is a separate issue .? I should create an issue for jvm.h and jmm.h (I looked at the files under the include directory and jvm.h and jmm.h are the only two internal header files in the include directory). I do think removing the duplicated copy of jvm.h is a good change. 
This is finally possible with the consolidated repository and we no longer need to update two copies of jvm.h for any change to the JVM interface. This change will work with a -I directive setting to the new location, if changed later.

What do you think?

Mandy

From coleen.phillimore at oracle.com Fri Oct 27 18:13:22 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 27 Oct 2017 14:13:22 -0400
Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot
In-Reply-To: <51f09db9-06f5-ad01-bc92-1d73e1113f86@oracle.com>
References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> <51f09db9-06f5-ad01-bc92-1d73e1113f86@oracle.com>
Message-ID: <2a48b157-06e5-668e-7533-3e073620d7cd@oracle.com>

On 10/27/17 1:47 PM, mandy chung wrote:
>
> On 10/27/17 7:08 AM, coleen.phillimore at oracle.com wrote:
>>
>> On 10/27/17 9:37 AM, David Holmes wrote:
>>>
>>> The one file that is needed is a hotspot file - jvm.h defines the interface that hotspot exports via jvm.cpp.
>>>
>>> If you leave jvm.h in hotspot/prims then a very large chunk of your boilerplate changes are not needed. The JDK code doesn't care what the name of the directory is - whatever it is just gets added as a -I directive (the JDK code will include "jvm.h" not "prims/jvm.h" the way hotspot sources do).
>>>
>>> This isn't something we want to change back or move again later. Whatever we do now we live with.
>>
>> I think it belongs with jni.h and I think the core libraries group would agree. It seems more natural there than buried in the hotspot prims directory. I guess this is on hold while we have this debate. Sigh.
>>
>> Actually with -I directives, changing to jvm.h from prims/jvm.h would still work. Maybe we should change the name to jvm.hpp since it's jvm.cpp though? Or maybe just have two divergent copies and close this as WNF.
>
> I also think hotspot/prims is not a good location. src/java.base/share/include is a well-defined location for native header files. Maybe internal header files could be placed in include/internal but this is a separate issue. I should create an issue for jvm.h and jmm.h (I looked at the files under the include directory and jvm.h and jmm.h are the only two internal header files in the include directory).
>
> I do think removing the duplicated copy of jvm.h is a good change. This is finally possible with the consolidated repository and we no longer need to update two copies of jvm.h for any change to the JVM interface. This change will work with a -I directive setting to the new location, if changed later.
>
> What do you think?

I agree. I'm not really bothered by it being in src/java.base/share/include in the first place though. Only jni.h and jni_md.h are copied into the images, so it seems a bit painful to make jvm.h be in some other directory. But your call, really.
Thanks, Coleen > > Mandy From coleen.phillimore at oracle.com Fri Oct 27 20:20:18 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 27 Oct 2017 16:20:18 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <57390ec3-8d8d-a3d7-9774-b5945a323be9@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <57390ec3-8d8d-a3d7-9774-b5945a323be9@oracle.com> Message-ID: Incremental webrev: http://cr.openjdk.java.net/~coleenp/8189610.incr.01/webrev/index.html thanks, Coleen On 10/27/17 11:13 AM, coleen.phillimore at oracle.com wrote: > > > On 10/27/17 9:37 AM, David Holmes wrote: >>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>> >>>> ?ConstantIntValue((jint)0); >>>> >>>> why is this cast needed? what causes the ambiguity? (If this was a >>>> template I'd understand ;-) ). Also didn't you change that >>>> constructor to take an int anyway - not that I think it should - >>>> see below. >>> >>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match >>> 'long' better than any pointer type.? So this cast is needed. >> >> But you changed the constructor to take an int! >> >> ?class ConstantIntValue: public ScopeValue { >> ? private: >> -? jint _value; >> +? int _value; >> ? public: >> -? ConstantIntValue(jint value)???????? { _value = value; } >> +? ConstantIntValue(int value)????????? { _value = value; } >> > I changed this back to not take an int and changed c1_LinearScan.cpp > to have the (jint)0 cast and output.cp needed (jint)0 casts.? 0L > doesn't work for platforms where jint is an 'int' rather than a long > because it's ambiguous with the functions that take a pointer type. > Probably better to keep the type of ConstantIntValue consistent with j > types. > > Thanks, > Coleen From david.holmes at oracle.com Sat Oct 28 07:46:44 2017 From: david.holmes at oracle.com (David Holmes) Date: Sat, 28 Oct 2017 17:46:44 +1000 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <51f09db9-06f5-ad01-bc92-1d73e1113f86@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> <51f09db9-06f5-ad01-bc92-1d73e1113f86@oracle.com> Message-ID: <66b590da-f94c-6d87-cf61-e269bf1afc0d@oracle.com> On 28/10/2017 3:47 AM, mandy chung wrote: > On 10/27/17 7:08 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 10/27/17 9:37 AM, David Holmes wrote: >>> >>> The one file that is needed is a hotspot file - jvm.h defines the >>> interface that hotspot exports via jvm.cpp. >>> >>> If you leave jvm.h in hotspot/prims then a very large chunk of your >>> boilerplate changes are not needed. The JDK code doesn't care what >>> the name of the directory is - whatever it is just gets added as a -I >>> directive (the JDK code will include "jvm.h" not "prims/jvm.h" the >>> way hotspot sources do. >>> >>> This isn't something we want to change back or move again later. >>> Whatever we do now we live with. 
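To make the -I point above concrete, a minimal illustration (the exact flag spelling and directory here are assumptions for the sketch, not taken from the webrev or the build files):

    // Illustration only: if the directory containing jvm.h is on the include
    // path, e.g. the build adds something like
    //   -I$(TOPDIR)/src/java.base/share/native/include
    // then a JDK native source file can simply write
    #include "jvm.h"     // found via -I, wherever jvm.h actually lives
    // instead of hard-coding a HotSpot-internal layout such as
    // #include "prims/jvm.h"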
>> >> I think it belongs with jni.h and I think the core libraries group >> would agree.?? It seems more natural there than buried in the hotspot >> prims directory.? I guess this is on hold while we have this debate. >> Sigh. >> >> Actually with -I directives, changing to jvm.h from prims/jvm.h would >> still work.?? Maybe we should change the name to jvm.hpp since it's >> jvm.cpp though??? Or maybe just have two divergent copies and close >> this as WNF. > > I also think hotspot/prims is not a good location. > src/java.base/share/include is a well-defined location for native header > files.? Maybe internal header files could be placed in include/internal > but this is a separate issue .? I should create an issue for jvm.h and > jmm.h (I looked at the files under the include directory and jvm.h and > jmm.h are the only two internal header files in the include directory). Keeping it in prims avoids the need to touch many hotspot files, and with no changes needed on the JDK side because we use a -I directive to set the include path anyway. This is the exported VM interface so it makes sense to me for it to be located in the VM sources. But I'm not going to oppose this either way so it's up to Coleen. > I do think removing the duplicated copy of jvm.h is a good change. This > is finally possible with the consolidated repository and we no longer > need to update two copies of jvm.h for any change to the JVM Unfortunately we did not do this though - hence the divergence between the two. The use of int versus long for jint is causing a real problem. Coleen also hit the other issue on the head. The JNI and JVM interfaces are C interfaces, not C++. The JDK code that uses them is compiled as C - so all good. But the JVM code that implements them is compiled as C++, and that is why we are getting issues with differing linkage directives. David ----- > interface.?? This change will work with -I directive setting to the new > location, if changed later. > > What do you think? > > Mandy From david.holmes at oracle.com Sat Oct 28 07:50:27 2017 From: david.holmes at oracle.com (David Holmes) Date: Sat, 28 Oct 2017 17:50:27 +1000 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> Message-ID: <22afedef-59cc-ecde-48fc-0afb7b4bbb47@oracle.com> Hi Coleen, I've commented on the file location in response to Mandy's email. The only issue I'm still concerned about is the JVM_MAXPATHLEN issue. I think it is a bug to define a JVM_MAXPATHLEN that is bigger than the platform MAXPATHLEN. I also would not want to see any change in behaviour because of this - so AIX and Solaris should not get a different JVM_MAXPATHLEN due to this refactoring change. So yes I think this needs to be ifdef'd for Linux and reluctantly (because it was a copy error) for OSX/BSD as well. Thanks, David On 28/10/2017 12:08 AM, coleen.phillimore at oracle.com wrote: > > > On 10/27/17 9:37 AM, David Holmes wrote: >> On 27/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 10/27/17 3:23 AM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> Thanks for tackling this. 
>>>> >>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>> >>>> Can you update the bug synopsis to show it covers both sets of files >>>> please. >>>> >>>> I hate to start with this (and it took me quite a while to realize >>>> it) but as Mandy pointed out jvm.h is not an exported interface from >>>> the JDK to the outside world (so not subject to CSR review), but is >>>> a private interface between the JVM and the JDK libraries. So I >>>> think really jvm.h belongs in the hotspot sources where it was, >>>> while jni.h belongs in the exported JDK sources. In which case the >>>> bulk of your changes to the hotspot files would not be needed - sorry. >>> >>> Maybe someone can make that decision and change at a later date. The >>> point of this change is that there is now only one of these files >>> that is shared.? I don't think jvm.h and the jvm_md.h belong on the >>> hotspot sources for the jdk to find them in some random prims and os >>> dependent directories. >> >> The one file that is needed is a hotspot file - jvm.h defines the >> interface that hotspot exports via jvm.cpp. >> >> If you leave jvm.h in hotspot/prims then a very large chunk of your >> boilerplate changes are not needed. The JDK code doesn't care what the >> name of the directory is - whatever it is just gets added as a -I >> directive (the JDK code will include "jvm.h" not "prims/jvm.h" the way >> hotspot sources do. >> >> This isn't something we want to change back or move again later. >> Whatever we do now we live with. > > I think it belongs with jni.h and I think the core libraries group would > agree.?? It seems more natural there than buried in the hotspot prims > directory.? I guess this is on hold while we have this debate.?? Sigh. > > Actually with -I directives, changing to jvm.h from prims/jvm.h would > still work.?? Maybe we should change the name to jvm.hpp since it's > jvm.cpp though??? Or maybe just have two divergent copies and close this > as WNF. > >> >>> I'm happy to withdraw the CSR.? We generally use the CSR process to >>> add and remove JVM_ interfaces even though they're a private >>> interface in case some other JVM/JDK combination relies on them. The >>> changes to these files are very minor though and not likely to cause >>> any even theoretical incompatibility, so I'll withdraw it. >>>> >>>> Moving on ... >>>> >>>> First to address the initial comments/query you had: >>>> >>>>> The JDK windows jni_md.h file defined jint as long and the hotspot >>>>> windows jni_x86.h as int. I had to choose the jdk version since >>>>> it's the >>>>> public version, so there are changes to the hotspot files for this. >>>> >>>> On Windows int and long are always the same as it uses ILP32 or >>>> LLP64 (not LP64 like *nix platforms). So either choice should be >>>> fine. That said there are some odd casting issues I comment on >>>> below. Does the VS compiler complain about mixing int and long in >>>> expressions? >>> >>> Yes, it does even though int and long are the same representation. >> >> And what an absolute mess that makes. :( >> >>>> >>>>> Generally I changed the code to use 'int' rather than 'jint' where the >>>>> surrounding API didn't insist on consistently using java types. We >>>>> should mostly be using C++ types within hotspot except in >>>>> interfaces to >>>>> native/JNI code. >>>> >>>> I think you pulled too hard on a few threads here and things are >>>> starting to unravel. 
There are numerous cases I refer to below where >>>> either the cast seems unnecessary/inappropriate or else highlights a >>>> bunch of additional changes that also need to be made. The fan out >>>> from this could be horrendous. Unless you actually get some kind of >>>> error - and I'd like to understand the details of those - I would >>>> not suggest making these changes as part of this work. >>> >>> I didn't make any change unless there was was an error.? I have 100 >>> failed JPRT jobs to confirm!? I eventually got a Windows system to >>> compile and test this on.?? Actually some of the changes came out >>> better.? Cases where we use jint as a bool simply turned to int.? We >>> do not have an overload for bool for cmpxchg. >> >> That's unfortunate - ditto for OrderAccess. >> >>>> >>>> Looking through I have a quite a few queries/comments - apologies in >>>> advance as I know how tedious this is: >>>> >>>> make/hotspot/lib/CompileLibjsig.gmk >>>> src/java.base/solaris/native/libjsig/jsig.c >>>> >>>> Took a while to figure out why the include was needed. :) As a >>>> follow up I suggest just deleting the -I include directive, delete >>>> the Solaris-only definition of JSIG_VERSION_1_4_1, and delete >>>> everything to do with JVM_get_libjsig_version. It is all obsolete. >>> >>> Can I patch up jsig in a separate RFE?? I don't remember why this >>> broke so I simply moved JSIG #define.? Is jsig obsolete? Removing >>> JVM_* definitions generally requires a CSR. >> >> I did say "As a follow up". jsig is not obsolete but the jsig >> versioning code, only used by Solaris, is. >> >>>> >>>> --- >>>> >>>> src/hotspot/cpu/arm/interp_masm_arm.cpp >>>> >>>> Why did you need to add the jvm.h include? >>>> >>> >>> ?? tbz(Raccess_flags, JVM_ACC_SYNCHRONIZED_BIT, unlocked); >> >> Okay. I'm not going to try and figure out how this code found this >> before. >> >>>> --- >>>> >>>> src/hotspot/os/windows/os_windows.cpp. >>>> >>>> The type of process_exiting should be uint to match the DWORD of >>>> GetCurrentThreadID. Then you should need any casts. Also you missed >>>> this jint cast: >>>> >>>> 3796???????? process_exiting != (jint)GetCurrentThreadId()) { >>> >>> Yes, that's better to change process_exiting to a DWORD.? It needs a >>> DWORD cast to 0 in the cmpxchg. >>> >>> ???????? Atomic::cmpxchg(GetCurrentThreadId(), &process_exiting, >>> (DWORD)0); >>> >>> These templates are picky. >> >> Yes - their inability to deal with literals is extremely frustrating. >> >>>> >>>> --- >>>> >>>> src/hotspot/share/c1/c1_Canonicalizer.hpp >>>> >>>> ? 43 #ifdef _WINDOWS >>>> ? 44?? // jint is defined as long in jni_md.h, so convert from int >>>> to jint >>>> ? 45?? void set_constant(int x)?????????????????????? { >>>> set_constant((jint)x); } >>>> ? 46 #endif >>>> >>>> Why is this necessary? int and long are the same on Windows. The >>>> whole point is that jint hides the underlying type, so where does >>>> this go wrong? >>> >>> No, they are not the same types even though they have the same >>> representation! >> >> This is truly unfortunate. >> >>>> >>>> --- >>>> >>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>> >>>> ?ConstantIntValue((jint)0); >>>> >>>> why is this cast needed? what causes the ambiguity? (If this was a >>>> template I'd understand ;-) ). Also didn't you change that >>>> constructor to take an int anyway - not that I think it should - see >>>> below. >>> >>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match >>> 'long' better than any pointer type.? So this cast is needed. 
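A standalone sketch of the overload ambiguity being described here (the overload names and types are hypothetical, not the actual debugInfo.hpp declarations; jint is spelled long to mirror the Windows jni_md.h):

    #include <cstdio>

    typedef long jint;                      // Windows: jint is long

    struct ScopeValue {};

    // Two plausible overloads, standing in for ConstantIntValue(jint) and a
    // pointer-taking overload elsewhere in the same API.
    void take(jint v)        { std::printf("integer overload\n"); }
    void take(ScopeValue* p) { std::printf("pointer overload\n"); }

    int main() {
      // take(0);      // error: ambiguous - the int 0 converts equally well to
      //               // long (integral conversion) and to ScopeValue* (null
      //               // pointer conversion)
      take((jint)0);   // OK: exact match once the literal is cast
      return 0;
    }
    // On unix, where jint is a plain int, take(0) is an exact match and compiles
    // without the cast - which is why the problem only shows up on Windows.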
>> >> But you changed the constructor to take an int! >> >> ?class ConstantIntValue: public ScopeValue { >> ? private: >> -? jint _value; >> +? int _value; >> ? public: >> -? ConstantIntValue(jint value)???????? { _value = value; } >> +? ConstantIntValue(int value)????????? { _value = value; } >> >> > > Okay I removed this cast. > >>>> --- >>>> >>>> src/hotspot/share/ci/ciReplay.cpp >>>> >>>> 793???????? jint* dims = NEW_RESOURCE_ARRAY(jint, rank); >>>> >>>> why should this be jint? >>> >>> To avoid a cast from int* to jint* in the line below: >>> >>> ????????? value = kelem->multi_allocate(rank, dims, CHECK); >>> >>> >>>> >>>> --- >>>> >>>> src/hotspot/share/classfile/altHashing.cpp >>>> >>>> Okay this looks more consistent with jint. >>> >>> Yes.? I translated this from some native code iirc. >>>> >>>> --- >>>> >>>> src/hotspot/share/code/debugInfo.hpp >>>> >>>> These changes seem wrong. We have: >>>> >>>> ConstantLongValue(jlong value) >>>> ConstantDoubleValue(jdouble value) >>>> >>>> so we should have: >>>> >>>> ConstantIntValue(jint value) >>> >>> Again, there are multiple call sites with '0', which match int >>> trivially but are confused with long.? It's less consistent I agree >>> but better to not cast all the call sites. >> >> This is really making a mess of the APIs - they should be a jint but >> we declare them int because of a 0 casting problem. Can't we just use 0L? > > There aren't that many casts.? You're right, that would have been better > in some places. > >>>> >>>> --- >>>> >>>> src/hotspot/share/code/relocInfo.cpp >>>> >>>> Change seems unnecessary - int32_t is fine >>>> >>> >>> No, int32_t doesn't match the calls below it.? They all assume _lo >>> and _hi are jint. >>>> --- >>>> >>>> src/hotspot/share/compiler/compileBroker.cpp >>>> src/hotspot/share/compiler/compileBroker.hpp >>>> >>>> I see a complete mix of int and jint in this class, so why make the >>>> one change you did ?? >>> >>> This is another case of using jint as a flag with cmpxchg.? The >>> templates for cmpxchg want the types to match and 0 and 1 are >>> essentially 'int'.? This is a lot cleaner this way. >> >> >> >>>> >>>> --- >>>> >>>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >>>> >>>> 1700???? tty->write((char*) start, MIN2(length, (jint)O_BUFLEN)); >>>> >>>> why did you need to add the jint cast? It's used without any cast on >>>> the next two lines: >>>> >>>> 1701???? length -= O_BUFLEN; >>>> 1702???? offset += O_BUFLEN; >>>> >>> >>> There's a conversion from O_BUFLEN from int to long in 1701 and >>> 1702.?? MIN2 is a template that wants the types to match exactly. >> >> $%^%$! templates! >> >>>> ?? >>>> >>>> --- >>>> >>>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>>> >>>> Looking around this code it seems very confused about types - eg the >>>> previous function is declared jboolean yet returns a jint on one >>>> path! It isn't clear to me if the return type is what should be >>>> changed or the parameter type? I would just leave this alone. >>> >>> I can't leave it alone because it doesn't compile that way. This was >>> the minimal change and yea, does look a bit inconsistent. >>>> >>>> --- >>>> >>>> src/hotspot/share/opto/mulnode.cpp >>>> >>>> Okay TypeInt has jint parts, so the remaining int32_t declarations >>>> (A, B, C, D) should also be jint. >>> >>> Yes.? c2 uses jint types. >>>> >>>> --- >>>> >>>> src/hotspot/share/opto/parse3.cpp >>>> >>>> I agree with the changes you made, but then: >>>> >>>> ?419???? jint dim_con = find_int_con(length[j], -1); >>>> >>>> should also be changed. 
>>>> >>>> And obviously MultiArrayExpandLimit should be defined as int not intx! >>> >>> Everything in globals.hpp is intx.? That's a thread that I don't want >>> to pull on! >> >> We still have that limitation? >>> >>> Changed dim_con to int. >>>> >>>> --- >>>> >>>> src/hotspot/share/opto/phaseX.cpp >>>> >>>> I can see that intcon(jint i) is consistent with longcon(jlong l), >>>> but the use of "i" in the code is more consistent with int than jint. >>> >>> huh?? really? >>>> >>>> --- >>>> >>>> src/hotspot/share/opto/type.cpp >>>> >>>> 1505 int TypeInt::hash(void) const { >>>> 1506?? return java_add(java_add(_lo, _hi), java_add((jint)_widen, >>>> (jint)Type::Int)); >>>> 1507 } >>>> >>>> I can see that the (jint) casts you added make sense, but then the >>>> whole function should be returning jint not int. Ditto the other >>>> hash functions. >>> >>> I'm not messing with this, this is the minimal in type fixing that >>> I'm going to do here. >> >> >> >>>> >>>> --- >>>> >>>> src/hotspot/share/prims/jni.cpp >>>> >>>> I think vm_created should be a bool. In fact all the fields you >>>> changed are logically bools - do Atomics work for bool now? >>> >>> No, they do not.?? I had thought bool would be better originally too. >>>> >>>> --- >>>> >>>> src/hotspot/share/prims/jvm.cpp >>>> >>>> is_attachable is the terminology used in the JDK code. >>> >>> Well the JDK version had is_attach_supported() as the flag name so I >>> used that in this one place. >>>> >>>> --- >>>> >>>> src/hotspot/share/prims/jvmtiEnvBase.cpp >>>> src/hotspot/share/prims/jvmtiImpl.cpp >>>> >>>> Are you making parameters consistent with the fields they initialize? >>> >>> They're consistent with the declarations now. >>>> >>>> --- >>>> >>>> src/hotspot/share/prims/jvmtiTagMap.cpp >>>> >>>> There is a mix of int and jint for slot in this code. You fixed >>>> some, but this remains: >>>> >>>> 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong >>>> thread_tag, >>>> 2441??????????????????????????????????????????????????? jlong tid, >>>> 2442??????????????????????????????????????????????????? jint depth, >>>> 2443 jmethodID method, >>>> 2444 jlocation bci, >>>> 2445??????????????????????????????????????????????????? jint slot, >>> >>> Right for consistency with the declarations. >>>> >>>> --- >>>> >>>> src/hotspot/share/runtime/perfData.cpp >>>> >>>> Callers pass both jint and int, so param type seems arbitrary. >>> >>> They are, but importantly they match the declarations. >>>> >>>> --- >>>> >>>> src/hotspot/share/runtime/perfMemory.cpp >>>> src/hotspot/share/runtime/perfMemory.hpp >>>> >>>> PerfMemory::_initialized should ideally be a bool - can OrderAccess >>>> handle that now? >>> >>> Nope. >>>> >>>> --- >>>> >>>> src/java.base/share/native/include/jvm.h >>>> >>>> Not clear why the jio functions are not also JNICALL ? >>> >>> They are now.? The JDK version didn't have JNICALL.? JVM needs >>> JNICALL.? I can't tell you why JDK didn't need JNICALL linkage. >> >> ?? JVM currently does not have JNICALL. But they are declared as >> "extern C". > > This was a compilation error on Windows with JDK.?? Maybe the C code in > the JDK doesn't complain about linkage differences.? I'll have to go > back and figure this out then. >> >>>> >>>> --- >>>> >>>> src/java.base/unix/native/include/jni_md.h >>>> >>>> There is no need to special case ARM. The differences in the >>>> existing code were for LTO support and that is now irrelevant. >>> >>> See discussion with Magnus.?? 
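For readers following the externally_visible discussion, the conditional in question is roughly of this shape (a simplified sketch; the real jni_md.h feature tests and the link-time-opt wiring differ, so treat the details as assumptions):

    /* Simplified sketch of the unix jni_md.h export macros under discussion. */
    #if defined(__GNUC__)
      #ifdef ARM   /* only while arm32 builds with link-time-opt still exist */
        #define JNIEXPORT  __attribute__((externally_visible, visibility("default")))
        #define JNIIMPORT  __attribute__((externally_visible, visibility("default")))
      #else
        #define JNIEXPORT  __attribute__((visibility("default")))
        #define JNIIMPORT  __attribute__((visibility("default")))
      #endif
    #else
      #define JNIEXPORT
      #define JNIIMPORT
    #endif
    #define JNICALL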
We still build ARM for jdk10/hs so I >>> needed this conditional or of course I wouldn't have added it.? We >>> can remove it with LTO support. >> >> Those builds are gone - this is obsolete. But yes all LTO can be >> removed later if you wish. Just trying to simplify things now. >> >>>> >>>> --- >>>> >>>> src/java.base/unix/native/include/jvm_md.h >>>> >>>> I know you've just copied this across, but it seems wrong to me: >>>> >>>> ?57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. >>>> This may >>>> ? 58 //?????? cause problems if JVM and the rest of JDK are built on >>>> different >>>> ? 59 //?????? Linux releases. Here we define JVM_MAXPATHLEN to be >>>> MAXPATHLEN + 1, >>>> ? 60 //?????? so buffers declared in VM are always >= 4096. >>>> ? 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>> >>>> It doesn't make sense to me to define an internal "max path length" >>>> that can _exceed_ the platform max! >>>> >>>> That aside there's no support for building different parts of the >>>> JDK on different platforms and then bringing them together. And in >>>> any case I would think the real problem would be building on a >>>> platform that uses 4096 and running on one that uses 4095! >>>> >>>> But that aside this is a Linux hack and should be guarded by ifdef >>>> LINUX. (I doubt BSD needs it, the bsd file is just a copy of the >>>> linux one - the JDK macosx version does the right thing). Solaris >>>> and AIX should stay as-is at MAXPATHLEN. >>> >>> All of the unix platforms had MAXPATHLEN+1.? I'll leave it for now >>> and we can investigate that further. >> >> I see the following existing code: >> >> src/java.base/unix/native/include/jvm_md.h: >> >> #define JVM_MAXPATHLEN MAXPATHLEN >> >> src/java.base/macosx/native/include/jvm_md.h >> >> #define JVM_MAXPATHLEN MAXPATHLEN >> >> src/hotspot/os/aix/jvm_aix.h >> >> #define JVM_MAXPATHLEN MAXPATHLEN >> >> src/hotspot/os/bsd/jvm_bsd.h >> >> #define JVM_MAXPATHLEN MAXPATHLEN + 1? // blindly copied from Linux >> version >> >> src/hotspot/os/linux/jvm_linux.h >> >> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >> >> src/hotspot/os/solaris/jvm_solaris.h >> >> #define JVM_MAXPATHLEN MAXPATHLEN >> >> This is a linux only hack (if you ignore the blind copy from linux >> into the BSD code in the VM). > > Oh, thanks, so should I add a bunch of ifdefs then?? Or do you think > having MAXPATHLEN + 1 will really break the other platforms?? Do you > really see this as a problem or are you just pointing out inconsistency? >> >>>> >>>> ?86 #define ASYNC_SIGNAL???? SIGJVM2 >>>> >>>> This only exists on Solaris so I think should be in #ifdef SOLARIS, >>>> to make that clear. >>> >>> Ok.? I'll add this. >>>> >>>> --- >>>> >>>> src/java.base/windows/native/include/jvm_md.h >>>> >>>> Given the differences between the two versions either something has >>>> been broken or "extern C" declarations are not needed :) >>> >>> Well, they are needed for Hotspot to build and do not prevent jdk >>> from building.? I don't know what was broken. >> >> We really need to understand this better. Maybe related to the map >> files that expose the symbols. ?? > > They're needed because the JDK files are written mostly in C and that > doesn't complain about the linkage difference.? Hotspot files are in C++ > which does complain. > >> >>>> >>>> --- >>>> >>>> That was a really painful way to spend most of my Friday. TGIF! :) >>> >>> Thanks for going through it.? See comments inline for changes. >>> Generating a webrev takes hours so I'm not going to do that unless >>> you insist. 
>> >> An incremental webrev shouldn't take long - right? You're a mq maestro >> now. :) > > Well I generally trash a repository whenever I use mq but sure. >> >> If you can reasonably produce an incremental webrev once you've >> settled on all the comments/issues that would be good. > > Ok, sure. > > Coleen >> >> Thanks, >> David >> >>> Thanks, >>> Coleen >>> >>> >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>> >>>> On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: >>>>> ??Hi Magnus, >>>>> >>>>> Thank you for reviewing this.?? I have a new version that takes out >>>>> the hack in globalDefinitions.hpp and adds casts to >>>>> src/hotspot/share/opto/type.cpp instead. >>>>> >>>>> Also some fixes from Martin at SAP. >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >>>>> >>>>> see below. >>>>> >>>>> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>>>>> Coleen, >>>>>> >>>>>> Thank you for addressing this! >>>>>> >>>>>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>> >>>>>>> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" >>>>>>> after precompiled.h, so if you have repetitive stress wrist >>>>>>> issues don't click on most of these files. >>>>>>> >>>>>>> There were more issues to resolve, however.? The JDK windows >>>>>>> jni_md.h file defined jint as long and the hotspot windows >>>>>>> jni_x86.h as int. I had to choose the jdk version since it's the >>>>>>> public version, so there are changes to the hotspot files for >>>>>>> this. Generally I changed the code to use 'int' rather than >>>>>>> 'jint' where the surrounding API didn't insist on consistently >>>>>>> using java types. We should mostly be using C++ types within >>>>>>> hotspot except in interfaces to native/JNI code.? There are a >>>>>>> couple of hacks in places where adding multiple jint casts was >>>>>>> too painful. >>>>>>> >>>>>>> Tested with JPRT and tier2-4 (in progress). >>>>>>> >>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>>>>> >>>>>> Looks great! >>>>>> >>>>>> Just a few comments: >>>>>> >>>>>> * src/java.base/unix/native/include/jni_md.h: >>>>>> >>>>>> I don't think the externally_visible attribute should be there for >>>>>> arm. I know this was the case for the corresponding hotspot file >>>>>> for arm, but that was techically incorrect. The proper dependency >>>>>> here is that externally_visible should be in all JNIEXPORT if and >>>>>> only if we're building with JVM feature "link-time-opt". >>>>>> Traditionally, that feature been enabled when building arm32 >>>>>> builds, and only then, so there's been a (coincidentally) >>>>>> connection here. Nowadays, Oracle does not care about the arm32 >>>>>> builds, and I'm not sure if anyone else is building them with >>>>>> link-time-opt enabled. >>>>>> >>>>>> It does seem wrong to me to export this behavior in the public >>>>>> jni_md.h file, though. I think the correct way to solve this, if >>>>>> we should continue supporting link-time-opt is to make sure this >>>>>> attribute is set for exported hotspot functions. If it's still >>>>>> needed, that is. A quick googling seems to indicate that >>>>>> visibility("default") might be enough in modern gcc's. >>>>>> >>>>>> A third option is to remove the support for link-time-opt >>>>>> entirely, if it's not really used. 
>>>>> >>>>> I didn't know how to change this since we are still building ARM >>>>> with the jdk10/hs repository, and ARM needed this change.? I could >>>>> wait until we bring down the jdk10/master changes that remove the >>>>> ARM build and remove this conditional before I push. Or we could >>>>> file an RFE to remove link-time-opt (?) and remove it then? >>>>> >>>>>> >>>>>> * src/java.base/unix/native/include/jvm_md.h and >>>>>> src/java.base/windows/native/include/jvm_md.h: >>>>>> >>>>>> These files define a public API, and contain non-trivial changes. >>>>>> I suspect you should file a CSR request. (Even though I realize >>>>>> you're only matching the header file with the reality.) >>>>>> >>>>> >>>>> I filed the CSR.?? Waiting for the next steps. >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>>> /Magnus >>>>>> >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>>>>> >>>>>>> I have a script to update copyright files on commit. >>>>>>> >>>>>>> Thanks to Magnus and ErikJ for the makefile changes. >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>> >>>>> >>> > From david.holmes at oracle.com Sat Oct 28 07:58:30 2017 From: david.holmes at oracle.com (David Holmes) Date: Sat, 28 Oct 2017 17:58:30 +1000 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <57390ec3-8d8d-a3d7-9774-b5945a323be9@oracle.com> Message-ID: <0f568e05-6f06-d2df-571e-0c591f062c15@oracle.com> On 28/10/2017 6:20 AM, coleen.phillimore at oracle.com wrote: > > Incremental webrev: > > http://cr.openjdk.java.net/~coleenp/8189610.incr.01/webrev/index.html That all looks fine - thanks. If I get a chance I'll look deeper into why the VS compiler needs 0 to be cast to jint (aka long) to avoid ambiguity with it being a NULL pointer. I could understand if it always needed the cast, but not only needing it for long, but not int. Thanks, David > thanks, > Coleen > > On 10/27/17 11:13 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 10/27/17 9:37 AM, David Holmes wrote: >>>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>>> >>>>> ?ConstantIntValue((jint)0); >>>>> >>>>> why is this cast needed? what causes the ambiguity? (If this was a >>>>> template I'd understand ;-) ). Also didn't you change that >>>>> constructor to take an int anyway - not that I think it should - >>>>> see below. >>>> >>>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match >>>> 'long' better than any pointer type.? So this cast is needed. >>> >>> But you changed the constructor to take an int! >>> >>> ?class ConstantIntValue: public ScopeValue { >>> ? private: >>> -? jint _value; >>> +? int _value; >>> ? public: >>> -? ConstantIntValue(jint value)???????? { _value = value; } >>> +? ConstantIntValue(int value)????????? { _value = value; } >>> >> I changed this back to not take an int and changed c1_LinearScan.cpp >> to have the (jint)0 cast and output.cp needed (jint)0 casts.? 0L >> doesn't work for platforms where jint is an 'int' rather than a long >> because it's ambiguous with the functions that take a pointer type. >> Probably better to keep the type of ConstantIntValue consistent with j >> types. 
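A companion sketch for the unix case mentioned here, where jint is a plain int and a 0L literal runs into the same kind of ambiguity (again hypothetical overloads, not the real constructors):

    typedef int jint;                        // unix: jint is int

    struct ScopeValue {};

    void take(jint v)        {}              // e.g. ConstantIntValue(jint)
    void take(ScopeValue* p) {}              // some pointer-taking overload

    void demo() {
      take(0);         // fine: exact match for the int overload
      // take(0L);     // error: long -> int and 0L -> ScopeValue* are both
      //               // plain conversions, so the call is ambiguous
      take((jint)0);   // the cast works for either definition of jint
    }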
>>
>> Thanks,
>> Coleen
>

From kumar.x.srinivasan at oracle.com Fri Oct 27 17:12:43 2017
From: kumar.x.srinivasan at oracle.com (Kumar Srinivasan)
Date: Fri, 27 Oct 2017 10:12:43 -0700
Subject: RFR: 8190287: Update JDK's internal ASM to ASMv6
Message-ID: <59F3690B.6070309@oracle.com>

Hello Remi, Sundar and others,

Please review the webrev [1] to update JDK's internal ASM to v6. To help with review areas, you can use the browser to search for mq patches commented with //

Highlights of changes:
1. updated ASMv6 // jdk-new-asmv6.patch
2. changes to jlink and jar to add ModuleMainClass and ModulePackages attributes //jdk-new-asm-update.patch
3. adjustments to jdk tests //jdk-new-asm-test.patch
4. minor adjustments to hotspot tests //jdk-new-hotspot-test.patch

Tests: jdk_tier1, jdk_tier2, testset hotspot, hotspot_tier1, nashorn ant tests. Alan has also run several tests.

Big thanks to Alan for #2 and #3 as part of [3].

Thanks
Kumar

[1] http://cr.openjdk.java.net/~ksrini/8190287/webrev.00/index.html
[2] https://bugs.openjdk.java.net/browse/JDK-8190287
[3] https://bugs.openjdk.java.net/browse/JDK-8186236

From magnus.ihse.bursie at oracle.com Mon Oct 30 07:50:02 2017
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Mon, 30 Oct 2017 08:50:02 +0100
Subject: RFR [10] 8189800: Add support for AddressSanitizer
In-Reply-To:
References: <51eabbae-5435-59be-f443-a6b214a17513@oracle.com>
Message-ID: <55e0e055-2e65-5c83-3f8e-36895f71860e@oracle.com>

On 2017-10-30 08:39, Artem Smotrakov wrote:
> cc'ing hotspot-dev at openjdk.java.net as David suggested.
>
> Artem
>
> On 10/27/2017 11:02 PM, Artem Smotrakov wrote:
>> Hello,
>>
>> Please review the following patch which adds support for AddressSanitizer.
>>
>> AddressSanitizer is a runtime memory error detector which looks for various memory corruption issues and leaks.
>>
>> Please refer to [1] for details. AddressSanitizer is available in gcc 4.8+ and clang 3.1+.
>>
>> The patch below introduces an --enable-asan parameter for the configure script which enables AddressSanitizer.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8189800
>> Webrev: http://cr.openjdk.java.net/~asmotrak/8189800/webrev.00/

spec.gmk.in should only have export for variables that need to be exported in the environment for executing binaries, that is ASAN_OPTIONS and LD_LIBRARY_PATH, not ASAN_ENABLED or DEVKIT_LIB_DIR.

I'm also a bit curious about the addition of DEVKIT_LIB_DIR. Would you care to elaborate your thinking?

Otherwise it looks good.
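For context, a tiny self-contained example of the kind of defect AddressSanitizer reports at run time (illustrative only, unrelated to the build patch itself):

    // Compile with an ASan-enabled toolchain, e.g. g++ -fsanitize=address -g
    int main() {
      int* a = new int[10];
      int v = a[10];          // heap-buffer-overflow: reads one past the end
      delete [] a;
      return v != 0;          // keep the read alive so it is not optimized away
    }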
/Magnus >> >> [1] https://github.com/google/sanitizers/wiki/AddressSanitizer >> >> Artem > From coleen.phillimore at oracle.com Mon Oct 30 12:07:46 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Oct 2017 08:07:46 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <66b590da-f94c-6d87-cf61-e269bf1afc0d@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> <51f09db9-06f5-ad01-bc92-1d73e1113f86@oracle.com> <66b590da-f94c-6d87-cf61-e269bf1afc0d@oracle.com> Message-ID: On 10/28/17 3:46 AM, David Holmes wrote: > On 28/10/2017 3:47 AM, mandy chung wrote: >> On 10/27/17 7:08 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 10/27/17 9:37 AM, David Holmes wrote: >>>> >>>> The one file that is needed is a hotspot file - jvm.h defines the >>>> interface that hotspot exports via jvm.cpp. >>>> >>>> If you leave jvm.h in hotspot/prims then a very large chunk of your >>>> boilerplate changes are not needed. The JDK code doesn't care what >>>> the name of the directory is - whatever it is just gets added as a >>>> -I directive (the JDK code will include "jvm.h" not "prims/jvm.h" >>>> the way hotspot sources do. >>>> >>>> This isn't something we want to change back or move again later. >>>> Whatever we do now we live with. >>> >>> I think it belongs with jni.h and I think the core libraries group >>> would agree.?? It seems more natural there than buried in the >>> hotspot prims directory.? I guess this is on hold while we have this >>> debate.?? Sigh. >>> >>> Actually with -I directives, changing to jvm.h from prims/jvm.h >>> would still work.?? Maybe we should change the name to jvm.hpp since >>> it's jvm.cpp though??? Or maybe just have two divergent copies and >>> close this as WNF. >> >> I also think hotspot/prims is not a good location. >> src/java.base/share/include is a well-defined location for native >> header files.? Maybe internal header files could be placed in >> include/internal but this is a separate issue .? I should create an >> issue for jvm.h and jmm.h (I looked at the files under the include >> directory and jvm.h and jmm.h are the only two internal header files >> in the include directory). > > Keeping it in prims avoids the need to touch many hotspot files, and > with no changes needed on the JDK side because we use a -I directive > to set the include path anyway. This is the exported VM interface so > it makes sense to me for it to be located in the VM sources. > > But I'm not going to oppose this either way so it's up to Coleen. I've already disagreed that this file belongs in src/hotspot/share/prims, so the include directive without prims is preferred.? This allows putting jvm.h in a new place if/when that is agreed upon. > >> I do think removing the duplicated copy of jvm.h is a good change. >> This is finally possible with the consolidated repository and we no >> longer need to update two copies of jvm.h for any change to the JVM > > Unfortunately we did not do this though - hence the divergence between > the two. The use of int versus long for jint is causing a real problem. > > Coleen also hit the other issue on the head. The JNI and JVM > interfaces are C interfaces, not C++. 
The JDK code that uses them is > compiled as C - so all good. But the JVM code that implements them is > compiled as C++, and that is why we are getting issues with differing > linkage directives. Well, there is now one source file for jvm.h and jni.h and their machine dependent counterparts and 2500 lines of duplicated code is removed with this change.? The issues with jint and linkages are resolved and tested as well with this changeset. Thanks, Coleen > > David > ----- > >> interface.?? This change will work with -I directive setting to the >> new location, if changed later. >> >> What do you think? >> >> Mandy From coleen.phillimore at oracle.com Mon Oct 30 12:13:45 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Oct 2017 08:13:45 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <22afedef-59cc-ecde-48fc-0afb7b4bbb47@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> <22afedef-59cc-ecde-48fc-0afb7b4bbb47@oracle.com> Message-ID: On 10/28/17 3:50 AM, David Holmes wrote: > Hi Coleen, > > I've commented on the file location in response to Mandy's email. > > The only issue I'm still concerned about is the JVM_MAXPATHLEN issue. > I think it is a bug to define a JVM_MAXPATHLEN that is bigger than the > platform MAXPATHLEN. I also would not want to see any change in > behaviour because of this - so AIX and Solaris should not get a > different JVM_MAXPATHLEN due to this refactoring change. So yes I > think this needs to be ifdef'd for Linux and reluctantly (because it > was a copy error) for OSX/BSD as well. #if defined(AIX) || defined(SOLARIS) #define JVM_MAXPATHLEN MAXPATHLEN #else // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. This may //?????? cause problems if JVM and the rest of JDK are built on different //?????? Linux releases. Here we define JVM_MAXPATHLEN to be MAXPATHLEN + 1, //?????? so buffers declared in VM are always >= 4096. #define JVM_MAXPATHLEN MAXPATHLEN + 1 #endif Is this ok? thanks, Coleen > > Thanks, > David > > On 28/10/2017 12:08 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 10/27/17 9:37 AM, David Holmes wrote: >>> On 27/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 10/27/17 3:23 AM, David Holmes wrote: >>>>> Hi Coleen, >>>>> >>>>> Thanks for tackling this. >>>>> >>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>> >>>>> Can you update the bug synopsis to show it covers both sets of >>>>> files please. >>>>> >>>>> I hate to start with this (and it took me quite a while to realize >>>>> it) but as Mandy pointed out jvm.h is not an exported interface >>>>> from the JDK to the outside world (so not subject to CSR review), >>>>> but is a private interface between the JVM and the JDK libraries. >>>>> So I think really jvm.h belongs in the hotspot sources where it >>>>> was, while jni.h belongs in the exported JDK sources. In which >>>>> case the bulk of your changes to the hotspot files would not be >>>>> needed - sorry. >>>> >>>> Maybe someone can make that decision and change at a later date. >>>> The point of this change is that there is now only one of these >>>> files that is shared.? 
I don't think jvm.h and the jvm_md.h belong >>>> on the hotspot sources for the jdk to find them in some random >>>> prims and os dependent directories. >>> >>> The one file that is needed is a hotspot file - jvm.h defines the >>> interface that hotspot exports via jvm.cpp. >>> >>> If you leave jvm.h in hotspot/prims then a very large chunk of your >>> boilerplate changes are not needed. The JDK code doesn't care what >>> the name of the directory is - whatever it is just gets added as a >>> -I directive (the JDK code will include "jvm.h" not "prims/jvm.h" >>> the way hotspot sources do. >>> >>> This isn't something we want to change back or move again later. >>> Whatever we do now we live with. >> >> I think it belongs with jni.h and I think the core libraries group >> would agree.?? It seems more natural there than buried in the hotspot >> prims directory.? I guess this is on hold while we have this >> debate.?? Sigh. >> >> Actually with -I directives, changing to jvm.h from prims/jvm.h would >> still work.?? Maybe we should change the name to jvm.hpp since it's >> jvm.cpp though??? Or maybe just have two divergent copies and close >> this as WNF. >> >>> >>>> I'm happy to withdraw the CSR.? We generally use the CSR process to >>>> add and remove JVM_ interfaces even though they're a private >>>> interface in case some other JVM/JDK combination relies on them. >>>> The changes to these files are very minor though and not likely to >>>> cause any even theoretical incompatibility, so I'll withdraw it. >>>>> >>>>> Moving on ... >>>>> >>>>> First to address the initial comments/query you had: >>>>> >>>>>> The JDK windows jni_md.h file defined jint as long and the hotspot >>>>>> windows jni_x86.h as int. I had to choose the jdk version since >>>>>> it's the >>>>>> public version, so there are changes to the hotspot files for this. >>>>> >>>>> On Windows int and long are always the same as it uses ILP32 or >>>>> LLP64 (not LP64 like *nix platforms). So either choice should be >>>>> fine. That said there are some odd casting issues I comment on >>>>> below. Does the VS compiler complain about mixing int and long in >>>>> expressions? >>>> >>>> Yes, it does even though int and long are the same representation. >>> >>> And what an absolute mess that makes. :( >>> >>>>> >>>>>> Generally I changed the code to use 'int' rather than 'jint' >>>>>> where the >>>>>> surrounding API didn't insist on consistently using java types. We >>>>>> should mostly be using C++ types within hotspot except in >>>>>> interfaces to >>>>>> native/JNI code. >>>>> >>>>> I think you pulled too hard on a few threads here and things are >>>>> starting to unravel. There are numerous cases I refer to below >>>>> where either the cast seems unnecessary/inappropriate or else >>>>> highlights a bunch of additional changes that also need to be >>>>> made. The fan out from this could be horrendous. Unless you >>>>> actually get some kind of error - and I'd like to understand the >>>>> details of those - I would not suggest making these changes as >>>>> part of this work. >>>> >>>> I didn't make any change unless there was was an error.? I have 100 >>>> failed JPRT jobs to confirm!? I eventually got a Windows system to >>>> compile and test this on.?? Actually some of the changes came out >>>> better.? Cases where we use jint as a bool simply turned to int.? >>>> We do not have an overload for bool for cmpxchg. >>> >>> That's unfortunate - ditto for OrderAccess. 
>>> >>>>> >>>>> Looking through I have a quite a few queries/comments - apologies >>>>> in advance as I know how tedious this is: >>>>> >>>>> make/hotspot/lib/CompileLibjsig.gmk >>>>> src/java.base/solaris/native/libjsig/jsig.c >>>>> >>>>> Took a while to figure out why the include was needed. :) As a >>>>> follow up I suggest just deleting the -I include directive, delete >>>>> the Solaris-only definition of JSIG_VERSION_1_4_1, and delete >>>>> everything to do with JVM_get_libjsig_version. It is all obsolete. >>>> >>>> Can I patch up jsig in a separate RFE?? I don't remember why this >>>> broke so I simply moved JSIG #define.? Is jsig obsolete? Removing >>>> JVM_* definitions generally requires a CSR. >>> >>> I did say "As a follow up". jsig is not obsolete but the jsig >>> versioning code, only used by Solaris, is. >>> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/cpu/arm/interp_masm_arm.cpp >>>>> >>>>> Why did you need to add the jvm.h include? >>>>> >>>> >>>> ?? tbz(Raccess_flags, JVM_ACC_SYNCHRONIZED_BIT, unlocked); >>> >>> Okay. I'm not going to try and figure out how this code found this >>> before. >>> >>>>> --- >>>>> >>>>> src/hotspot/os/windows/os_windows.cpp. >>>>> >>>>> The type of process_exiting should be uint to match the DWORD of >>>>> GetCurrentThreadID. Then you should need any casts. Also you >>>>> missed this jint cast: >>>>> >>>>> 3796???????? process_exiting != (jint)GetCurrentThreadId()) { >>>> >>>> Yes, that's better to change process_exiting to a DWORD.? It needs >>>> a DWORD cast to 0 in the cmpxchg. >>>> >>>> ???????? Atomic::cmpxchg(GetCurrentThreadId(), &process_exiting, >>>> (DWORD)0); >>>> >>>> These templates are picky. >>> >>> Yes - their inability to deal with literals is extremely frustrating. >>> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/c1/c1_Canonicalizer.hpp >>>>> >>>>> ? 43 #ifdef _WINDOWS >>>>> ? 44?? // jint is defined as long in jni_md.h, so convert from int >>>>> to jint >>>>> ? 45?? void set_constant(int x)?????????????????????? { >>>>> set_constant((jint)x); } >>>>> ? 46 #endif >>>>> >>>>> Why is this necessary? int and long are the same on Windows. The >>>>> whole point is that jint hides the underlying type, so where does >>>>> this go wrong? >>>> >>>> No, they are not the same types even though they have the same >>>> representation! >>> >>> This is truly unfortunate. >>> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>>> >>>>> ?ConstantIntValue((jint)0); >>>>> >>>>> why is this cast needed? what causes the ambiguity? (If this was a >>>>> template I'd understand ;-) ). Also didn't you change that >>>>> constructor to take an int anyway - not that I think it should - >>>>> see below. >>>> >>>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match >>>> 'long' better than any pointer type.? So this cast is needed. >>> >>> But you changed the constructor to take an int! >>> >>> ?class ConstantIntValue: public ScopeValue { >>> ? private: >>> -? jint _value; >>> +? int _value; >>> ? public: >>> -? ConstantIntValue(jint value)???????? { _value = value; } >>> +? ConstantIntValue(int value)????????? { _value = value; } >>> >>> >> >> Okay I removed this cast. >> >>>>> --- >>>>> >>>>> src/hotspot/share/ci/ciReplay.cpp >>>>> >>>>> 793???????? jint* dims = NEW_RESOURCE_ARRAY(jint, rank); >>>>> >>>>> why should this be jint? >>>> >>>> To avoid a cast from int* to jint* in the line below: >>>> >>>> ????????? 
value = kelem->multi_allocate(rank, dims, CHECK); >>>> >>>> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/classfile/altHashing.cpp >>>>> >>>>> Okay this looks more consistent with jint. >>>> >>>> Yes.? I translated this from some native code iirc. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/code/debugInfo.hpp >>>>> >>>>> These changes seem wrong. We have: >>>>> >>>>> ConstantLongValue(jlong value) >>>>> ConstantDoubleValue(jdouble value) >>>>> >>>>> so we should have: >>>>> >>>>> ConstantIntValue(jint value) >>>> >>>> Again, there are multiple call sites with '0', which match int >>>> trivially but are confused with long.? It's less consistent I agree >>>> but better to not cast all the call sites. >>> >>> This is really making a mess of the APIs - they should be a jint but >>> we declare them int because of a 0 casting problem. Can't we just >>> use 0L? >> >> There aren't that many casts.? You're right, that would have been >> better in some places. >> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/code/relocInfo.cpp >>>>> >>>>> Change seems unnecessary - int32_t is fine >>>>> >>>> >>>> No, int32_t doesn't match the calls below it.? They all assume _lo >>>> and _hi are jint. >>>>> --- >>>>> >>>>> src/hotspot/share/compiler/compileBroker.cpp >>>>> src/hotspot/share/compiler/compileBroker.hpp >>>>> >>>>> I see a complete mix of int and jint in this class, so why make >>>>> the one change you did ?? >>>> >>>> This is another case of using jint as a flag with cmpxchg. The >>>> templates for cmpxchg want the types to match and 0 and 1 are >>>> essentially 'int'.? This is a lot cleaner this way. >>> >>> >>> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >>>>> >>>>> 1700???? tty->write((char*) start, MIN2(length, (jint)O_BUFLEN)); >>>>> >>>>> why did you need to add the jint cast? It's used without any cast >>>>> on the next two lines: >>>>> >>>>> 1701???? length -= O_BUFLEN; >>>>> 1702???? offset += O_BUFLEN; >>>>> >>>> >>>> There's a conversion from O_BUFLEN from int to long in 1701 and >>>> 1702.?? MIN2 is a template that wants the types to match exactly. >>> >>> $%^%$! templates! >>> >>>>> ?? >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>>>> >>>>> Looking around this code it seems very confused about types - eg >>>>> the previous function is declared jboolean yet returns a jint on >>>>> one path! It isn't clear to me if the return type is what should >>>>> be changed or the parameter type? I would just leave this alone. >>>> >>>> I can't leave it alone because it doesn't compile that way. This >>>> was the minimal change and yea, does look a bit inconsistent. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/opto/mulnode.cpp >>>>> >>>>> Okay TypeInt has jint parts, so the remaining int32_t declarations >>>>> (A, B, C, D) should also be jint. >>>> >>>> Yes.? c2 uses jint types. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/opto/parse3.cpp >>>>> >>>>> I agree with the changes you made, but then: >>>>> >>>>> ?419???? jint dim_con = find_int_con(length[j], -1); >>>>> >>>>> should also be changed. >>>>> >>>>> And obviously MultiArrayExpandLimit should be defined as int not >>>>> intx! >>>> >>>> Everything in globals.hpp is intx.? That's a thread that I don't >>>> want to pull on! >>> >>> We still have that limitation? >>>> >>>> Changed dim_con to int. 
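
(Aside: the MIN2 and cmpxchg failures mentioned above all come from template argument deduction needing one consistent type. A minimal sketch follows; min2_like, cmpxchg_like, DWORD_t, O_BUFLEN_like and current_thread_id_like are stand-ins with roughly the shape of the HotSpot and Win32 names, not the real declarations.)

    template <class T> T min2_like(T a, T b) { return a < b ? a : b; }

    template <typename T>
    T cmpxchg_like(T exchange_value, volatile T* dest, T compare_value) {
      T old = *dest;                        // the real thing is an atomic instruction
      if (old == compare_value) *dest = exchange_value;
      return old;
    }

    typedef unsigned long DWORD_t;          // what windows.h calls DWORD
    DWORD_t current_thread_id_like() { return 42; }

    const int O_BUFLEN_like = 2000;

    void examples(long length, volatile DWORD_t* process_exiting) {
      // min2_like(length, O_BUFLEN_like) deduces T = long from the first argument
      // and T = int from the second: deduction conflict, hard error.  The cast,
      // spelled (jint)O_BUFLEN in the jvmciCompilerToVM.cpp hunk, fixes it.
      long n = min2_like(length, (long)O_BUFLEN_like);
      (void)n;

      // Same story for the cmpxchg-shaped template: the literal 0 is an int while
      // the other two arguments say T = DWORD_t, so deduction fails without the
      // (DWORD)0 cast added in os_windows.cpp.
      cmpxchg_like(current_thread_id_like(), process_exiting, (DWORD_t)0);
    }

This is also why the jint flags used with cmpxchg in compileBroker.cpp came out simpler as plain int: the literals 0 and 1 then match the deduced type without casts.
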
>>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/opto/phaseX.cpp >>>>> >>>>> I can see that intcon(jint i) is consistent with longcon(jlong l), >>>>> but the use of "i" in the code is more consistent with int than jint. >>>> >>>> huh?? really? >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/opto/type.cpp >>>>> >>>>> 1505 int TypeInt::hash(void) const { >>>>> 1506?? return java_add(java_add(_lo, _hi), java_add((jint)_widen, >>>>> (jint)Type::Int)); >>>>> 1507 } >>>>> >>>>> I can see that the (jint) casts you added make sense, but then the >>>>> whole function should be returning jint not int. Ditto the other >>>>> hash functions. >>>> >>>> I'm not messing with this, this is the minimal in type fixing that >>>> I'm going to do here. >>> >>> >>> >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/prims/jni.cpp >>>>> >>>>> I think vm_created should be a bool. In fact all the fields you >>>>> changed are logically bools - do Atomics work for bool now? >>>> >>>> No, they do not.?? I had thought bool would be better originally too. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/prims/jvm.cpp >>>>> >>>>> is_attachable is the terminology used in the JDK code. >>>> >>>> Well the JDK version had is_attach_supported() as the flag name so >>>> I used that in this one place. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/prims/jvmtiEnvBase.cpp >>>>> src/hotspot/share/prims/jvmtiImpl.cpp >>>>> >>>>> Are you making parameters consistent with the fields they initialize? >>>> >>>> They're consistent with the declarations now. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/prims/jvmtiTagMap.cpp >>>>> >>>>> There is a mix of int and jint for slot in this code. You fixed >>>>> some, but this remains: >>>>> >>>>> 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong >>>>> thread_tag, >>>>> 2441 jlong tid, >>>>> 2442 jint depth, >>>>> 2443 jmethodID method, >>>>> 2444 jlocation bci, >>>>> 2445 jint slot, >>>> >>>> Right for consistency with the declarations. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/runtime/perfData.cpp >>>>> >>>>> Callers pass both jint and int, so param type seems arbitrary. >>>> >>>> They are, but importantly they match the declarations. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/runtime/perfMemory.cpp >>>>> src/hotspot/share/runtime/perfMemory.hpp >>>>> >>>>> PerfMemory::_initialized should ideally be a bool - can >>>>> OrderAccess handle that now? >>>> >>>> Nope. >>>>> >>>>> --- >>>>> >>>>> src/java.base/share/native/include/jvm.h >>>>> >>>>> Not clear why the jio functions are not also JNICALL ? >>>> >>>> They are now.? The JDK version didn't have JNICALL.? JVM needs >>>> JNICALL.? I can't tell you why JDK didn't need JNICALL linkage. >>> >>> ?? JVM currently does not have JNICALL. But they are declared as >>> "extern C". >> >> This was a compilation error on Windows with JDK.?? Maybe the C code >> in the JDK doesn't complain about linkage differences. I'll have to >> go back and figure this out then. >>> >>>>> >>>>> --- >>>>> >>>>> src/java.base/unix/native/include/jni_md.h >>>>> >>>>> There is no need to special case ARM. The differences in the >>>>> existing code were for LTO support and that is now irrelevant. >>>> >>>> See discussion with Magnus.?? We still build ARM for jdk10/hs so I >>>> needed this conditional or of course I wouldn't have added it.? We >>>> can remove it with LTO support. >>> >>> Those builds are gone - this is obsolete. But yes all LTO can be >>> removed later if you wish. Just trying to simplify things now. 
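
(For reference, the shape of the unix jni_md.h conditional being discussed above is roughly this sketch, not the exact text that was pushed; the ARM branch carrying externally_visible is the arm32/link-time-opt special case in question.)

    /* sketch of src/java.base/unix/native/include/jni_md.h, not the literal file */
    #if defined(__GNUC__)
      #if defined(ARM)   /* historically: arm32 builds done with link-time optimization */
        #define JNIEXPORT __attribute__((externally_visible, visibility("default")))
        #define JNIIMPORT __attribute__((externally_visible, visibility("default")))
      #else
        #define JNIEXPORT __attribute__((visibility("default")))
        #define JNIIMPORT __attribute__((visibility("default")))
      #endif
    #else
      #define JNIEXPORT
      #define JNIIMPORT
    #endif
    #define JNICALL

    typedef int jint;
    #ifdef _LP64
    typedef long jlong;
    #else
    typedef long long jlong;
    #endif
    typedef signed char jbyte;

GCC's externally_visible attribute only has an effect in whole-program/LTO builds, which is why the cleaner dependency would be the link-time-opt JVM feature rather than the CPU, or plain visibility("default") if that turns out to be enough, as suggested elsewhere in the thread.
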
>>> >>>>> >>>>> --- >>>>> >>>>> src/java.base/unix/native/include/jvm_md.h >>>>> >>>>> I know you've just copied this across, but it seems wrong to me: >>>>> >>>>> ?57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. >>>>> This may >>>>> ? 58 //?????? cause problems if JVM and the rest of JDK are built >>>>> on different >>>>> ? 59 //?????? Linux releases. Here we define JVM_MAXPATHLEN to be >>>>> MAXPATHLEN + 1, >>>>> ? 60 //?????? so buffers declared in VM are always >= 4096. >>>>> ? 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>> >>>>> It doesn't make sense to me to define an internal "max path >>>>> length" that can _exceed_ the platform max! >>>>> >>>>> That aside there's no support for building different parts of the >>>>> JDK on different platforms and then bringing them together. And in >>>>> any case I would think the real problem would be building on a >>>>> platform that uses 4096 and running on one that uses 4095! >>>>> >>>>> But that aside this is a Linux hack and should be guarded by ifdef >>>>> LINUX. (I doubt BSD needs it, the bsd file is just a copy of the >>>>> linux one - the JDK macosx version does the right thing). Solaris >>>>> and AIX should stay as-is at MAXPATHLEN. >>>> >>>> All of the unix platforms had MAXPATHLEN+1.? I'll leave it for now >>>> and we can investigate that further. >>> >>> I see the following existing code: >>> >>> src/java.base/unix/native/include/jvm_md.h: >>> >>> #define JVM_MAXPATHLEN MAXPATHLEN >>> >>> src/java.base/macosx/native/include/jvm_md.h >>> >>> #define JVM_MAXPATHLEN MAXPATHLEN >>> >>> src/hotspot/os/aix/jvm_aix.h >>> >>> #define JVM_MAXPATHLEN MAXPATHLEN >>> >>> src/hotspot/os/bsd/jvm_bsd.h >>> >>> #define JVM_MAXPATHLEN MAXPATHLEN + 1? // blindly copied from Linux >>> version >>> >>> src/hotspot/os/linux/jvm_linux.h >>> >>> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>> >>> src/hotspot/os/solaris/jvm_solaris.h >>> >>> #define JVM_MAXPATHLEN MAXPATHLEN >>> >>> This is a linux only hack (if you ignore the blind copy from linux >>> into the BSD code in the VM). >> >> Oh, thanks, so should I add a bunch of ifdefs then?? Or do you think >> having MAXPATHLEN + 1 will really break the other platforms?? Do you >> really see this as a problem or are you just pointing out inconsistency? >>> >>>>> >>>>> ?86 #define ASYNC_SIGNAL???? SIGJVM2 >>>>> >>>>> This only exists on Solaris so I think should be in #ifdef >>>>> SOLARIS, to make that clear. >>>> >>>> Ok.? I'll add this. >>>>> >>>>> --- >>>>> >>>>> src/java.base/windows/native/include/jvm_md.h >>>>> >>>>> Given the differences between the two versions either something >>>>> has been broken or "extern C" declarations are not needed :) >>>> >>>> Well, they are needed for Hotspot to build and do not prevent jdk >>>> from building.? I don't know what was broken. >>> >>> We really need to understand this better. Maybe related to the map >>> files that expose the symbols. ?? >> >> They're needed because the JDK files are written mostly in C and that >> doesn't complain about the linkage difference.? Hotspot files are in >> C++ which does complain. >> >>> >>>>> >>>>> --- >>>>> >>>>> That was a really painful way to spend most of my Friday. TGIF! :) >>>> >>>> Thanks for going through it.? See comments inline for changes. >>>> Generating a webrev takes hours so I'm not going to do that unless >>>> you insist. >>> >>> An incremental webrev shouldn't take long - right? You're a mq >>> maestro now. :) >> >> Well I generally trash a repository whenever I use mq but sure. 
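
(A note on the extern "C" point a few replies up: the usual way a header shared between the C JDK sources and the C++ HotSpot sources handles linkage is the standard guard sketched below. jio_vsnprintf is used only because it is the function family under discussion; the real jvm.h declaration, JNIEXPORT/JNICALL placement included, may differ.)

    #include <stdarg.h>
    #include <stddef.h>

    #ifdef __cplusplus
    extern "C" {      /* HotSpot is C++, so it needs C linkage here to match the
                         unmangled symbols; a C compiler never sees this block,
                         which is why the JDK side builds either way. */
    #endif

    int jio_vsnprintf(char *str, size_t count, const char *fmt, va_list args);

    #ifdef __cplusplus
    }
    #endif

JNICALL is a separate, calling-convention axis (it expands to __stdcall on Windows and to nothing on unix), and per the thread it is the C++ HotSpot build that notices when the two sides of that disagree.
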
>>> >>> If you can reasonably produce an incremental webrev once you've >>> settled on all the comments/issues that would be good. >> >> Ok, sure. >> >> Coleen >>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> Coleen >>>> >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>> >>>>> On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: >>>>>> ??Hi Magnus, >>>>>> >>>>>> Thank you for reviewing this.?? I have a new version that takes >>>>>> out the hack in globalDefinitions.hpp and adds casts to >>>>>> src/hotspot/share/opto/type.cpp instead. >>>>>> >>>>>> Also some fixes from Martin at SAP. >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >>>>>> >>>>>> see below. >>>>>> >>>>>> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>>>>>> Coleen, >>>>>>> >>>>>>> Thank you for addressing this! >>>>>>> >>>>>>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>>> >>>>>>>> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" >>>>>>>> after precompiled.h, so if you have repetitive stress wrist >>>>>>>> issues don't click on most of these files. >>>>>>>> >>>>>>>> There were more issues to resolve, however.? The JDK windows >>>>>>>> jni_md.h file defined jint as long and the hotspot windows >>>>>>>> jni_x86.h as int. I had to choose the jdk version since it's >>>>>>>> the public version, so there are changes to the hotspot files >>>>>>>> for this. Generally I changed the code to use 'int' rather than >>>>>>>> 'jint' where the surrounding API didn't insist on consistently >>>>>>>> using java types. We should mostly be using C++ types within >>>>>>>> hotspot except in interfaces to native/JNI code.? There are a >>>>>>>> couple of hacks in places where adding multiple jint casts was >>>>>>>> too painful. >>>>>>>> >>>>>>>> Tested with JPRT and tier2-4 (in progress). >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>>>>>> >>>>>>> Looks great! >>>>>>> >>>>>>> Just a few comments: >>>>>>> >>>>>>> * src/java.base/unix/native/include/jni_md.h: >>>>>>> >>>>>>> I don't think the externally_visible attribute should be there >>>>>>> for arm. I know this was the case for the corresponding hotspot >>>>>>> file for arm, but that was techically incorrect. The proper >>>>>>> dependency here is that externally_visible should be in all >>>>>>> JNIEXPORT if and only if we're building with JVM feature >>>>>>> "link-time-opt". Traditionally, that feature been enabled when >>>>>>> building arm32 builds, and only then, so there's been a >>>>>>> (coincidentally) connection here. Nowadays, Oracle does not care >>>>>>> about the arm32 builds, and I'm not sure if anyone else is >>>>>>> building them with link-time-opt enabled. >>>>>>> >>>>>>> It does seem wrong to me to export this behavior in the public >>>>>>> jni_md.h file, though. I think the correct way to solve this, if >>>>>>> we should continue supporting link-time-opt is to make sure this >>>>>>> attribute is set for exported hotspot functions. If it's still >>>>>>> needed, that is. A quick googling seems to indicate that >>>>>>> visibility("default") might be enough in modern gcc's. >>>>>>> >>>>>>> A third option is to remove the support for link-time-opt >>>>>>> entirely, if it's not really used. >>>>>> >>>>>> I didn't know how to change this since we are still building ARM >>>>>> with the jdk10/hs repository, and ARM needed this change.? 
I >>>>>> could wait until we bring down the jdk10/master changes that >>>>>> remove the ARM build and remove this conditional before I push. >>>>>> Or we could file an RFE to remove link-time-opt (?) and remove it >>>>>> then? >>>>>> >>>>>>> >>>>>>> * src/java.base/unix/native/include/jvm_md.h and >>>>>>> src/java.base/windows/native/include/jvm_md.h: >>>>>>> >>>>>>> These files define a public API, and contain non-trivial >>>>>>> changes. I suspect you should file a CSR request. (Even though I >>>>>>> realize you're only matching the header file with the reality.) >>>>>>> >>>>>> >>>>>> I filed the CSR.?? Waiting for the next steps. >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >>>>>>> /Magnus >>>>>>> >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>>>>>> >>>>>>>> I have a script to update copyright files on commit. >>>>>>>> >>>>>>>> Thanks to Magnus and ErikJ for the makefile changes. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>> >>>>>> >>>> >> From coleen.phillimore at oracle.com Mon Oct 30 12:15:31 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Oct 2017 08:15:31 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <0f568e05-6f06-d2df-571e-0c591f062c15@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <57390ec3-8d8d-a3d7-9774-b5945a323be9@oracle.com> <0f568e05-6f06-d2df-571e-0c591f062c15@oracle.com> Message-ID: <29688c76-4983-dffc-6ce2-402cf91dafbf@oracle.com> On 10/28/17 3:58 AM, David Holmes wrote: > On 28/10/2017 6:20 AM, coleen.phillimore at oracle.com wrote: >> >> Incremental webrev: >> >> http://cr.openjdk.java.net/~coleenp/8189610.incr.01/webrev/index.html > > That all looks fine - thanks. > > If I get a chance I'll look deeper into why the VS compiler needs 0 to > be cast to jint (aka long) to avoid ambiguity with it being a NULL > pointer. I could understand if it always needed the cast, but not only > needing it for long, but not int. Thanks,? Kim can probably tell you where in the spec this is. Coleen > > Thanks, > David > >> thanks, >> Coleen >> >> On 10/27/17 11:13 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 10/27/17 9:37 AM, David Holmes wrote: >>>>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>>>> >>>>>> ?ConstantIntValue((jint)0); >>>>>> >>>>>> why is this cast needed? what causes the ambiguity? (If this was >>>>>> a template I'd understand ;-) ). Also didn't you change that >>>>>> constructor to take an int anyway - not that I think it should - >>>>>> see below. >>>>> >>>>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match >>>>> 'long' better than any pointer type.? So this cast is needed. >>>> >>>> But you changed the constructor to take an int! >>>> >>>> ?class ConstantIntValue: public ScopeValue { >>>> ? private: >>>> -? jint _value; >>>> +? int _value; >>>> ? public: >>>> -? ConstantIntValue(jint value)???????? { _value = value; } >>>> +? ConstantIntValue(int value)????????? { _value = value; } >>>> >>> I changed this back to not take an int and changed c1_LinearScan.cpp >>> to have the (jint)0 cast and output.cp needed (jint)0 casts.? 
0L >>> doesn't work for platforms where jint is an 'int' rather than a long >>> because it's ambiguous with the functions that take a pointer type. >>> Probably better to keep the type of ConstantIntValue consistent with >>> j types. >>> >>> Thanks, >>> Coleen >> From david.holmes at oracle.com Mon Oct 30 12:17:38 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 30 Oct 2017 22:17:38 +1000 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> <22afedef-59cc-ecde-48fc-0afb7b4bbb47@oracle.com> Message-ID: <815ac734-ea8b-ea2d-ecec-85cb547ba2f4@oracle.com> On 30/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: > On 10/28/17 3:50 AM, David Holmes wrote: >> Hi Coleen, >> >> I've commented on the file location in response to Mandy's email. >> >> The only issue I'm still concerned about is the JVM_MAXPATHLEN issue. >> I think it is a bug to define a JVM_MAXPATHLEN that is bigger than the >> platform MAXPATHLEN. I also would not want to see any change in >> behaviour because of this - so AIX and Solaris should not get a >> different JVM_MAXPATHLEN due to this refactoring change. So yes I >> think this needs to be ifdef'd for Linux and reluctantly (because it >> was a copy error) for OSX/BSD as well. > > #if defined(AIX) || defined(SOLARIS) > #define JVM_MAXPATHLEN MAXPATHLEN > #else > // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. This may > //?????? cause problems if JVM and the rest of JDK are built on different > //?????? Linux releases. Here we define JVM_MAXPATHLEN to be MAXPATHLEN > + 1, > //?????? so buffers declared in VM are always >= 4096. > #define JVM_MAXPATHLEN MAXPATHLEN + 1 > #endif > > Is this ok? Yes - thanks. It preserves existing behaviour on the VM side at least. Time will tell if it messes anything up on the JDK side for Linux/OSX. David > thanks, > Coleen >> >> Thanks, >> David >> >> On 28/10/2017 12:08 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 10/27/17 9:37 AM, David Holmes wrote: >>>> On 27/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> On 10/27/17 3:23 AM, David Holmes wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> Thanks for tackling this. >>>>>> >>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>> >>>>>> Can you update the bug synopsis to show it covers both sets of >>>>>> files please. >>>>>> >>>>>> I hate to start with this (and it took me quite a while to realize >>>>>> it) but as Mandy pointed out jvm.h is not an exported interface >>>>>> from the JDK to the outside world (so not subject to CSR review), >>>>>> but is a private interface between the JVM and the JDK libraries. >>>>>> So I think really jvm.h belongs in the hotspot sources where it >>>>>> was, while jni.h belongs in the exported JDK sources. In which >>>>>> case the bulk of your changes to the hotspot files would not be >>>>>> needed - sorry. >>>>> >>>>> Maybe someone can make that decision and change at a later date. >>>>> The point of this change is that there is now only one of these >>>>> files that is shared.? 
I don't think jvm.h and the jvm_md.h belong >>>>> on the hotspot sources for the jdk to find them in some random >>>>> prims and os dependent directories. >>>> >>>> The one file that is needed is a hotspot file - jvm.h defines the >>>> interface that hotspot exports via jvm.cpp. >>>> >>>> If you leave jvm.h in hotspot/prims then a very large chunk of your >>>> boilerplate changes are not needed. The JDK code doesn't care what >>>> the name of the directory is - whatever it is just gets added as a >>>> -I directive (the JDK code will include "jvm.h" not "prims/jvm.h" >>>> the way hotspot sources do. >>>> >>>> This isn't something we want to change back or move again later. >>>> Whatever we do now we live with. >>> >>> I think it belongs with jni.h and I think the core libraries group >>> would agree.?? It seems more natural there than buried in the hotspot >>> prims directory.? I guess this is on hold while we have this >>> debate.?? Sigh. >>> >>> Actually with -I directives, changing to jvm.h from prims/jvm.h would >>> still work.?? Maybe we should change the name to jvm.hpp since it's >>> jvm.cpp though??? Or maybe just have two divergent copies and close >>> this as WNF. >>> >>>> >>>>> I'm happy to withdraw the CSR.? We generally use the CSR process to >>>>> add and remove JVM_ interfaces even though they're a private >>>>> interface in case some other JVM/JDK combination relies on them. >>>>> The changes to these files are very minor though and not likely to >>>>> cause any even theoretical incompatibility, so I'll withdraw it. >>>>>> >>>>>> Moving on ... >>>>>> >>>>>> First to address the initial comments/query you had: >>>>>> >>>>>>> The JDK windows jni_md.h file defined jint as long and the hotspot >>>>>>> windows jni_x86.h as int. I had to choose the jdk version since >>>>>>> it's the >>>>>>> public version, so there are changes to the hotspot files for this. >>>>>> >>>>>> On Windows int and long are always the same as it uses ILP32 or >>>>>> LLP64 (not LP64 like *nix platforms). So either choice should be >>>>>> fine. That said there are some odd casting issues I comment on >>>>>> below. Does the VS compiler complain about mixing int and long in >>>>>> expressions? >>>>> >>>>> Yes, it does even though int and long are the same representation. >>>> >>>> And what an absolute mess that makes. :( >>>> >>>>>> >>>>>>> Generally I changed the code to use 'int' rather than 'jint' >>>>>>> where the >>>>>>> surrounding API didn't insist on consistently using java types. We >>>>>>> should mostly be using C++ types within hotspot except in >>>>>>> interfaces to >>>>>>> native/JNI code. >>>>>> >>>>>> I think you pulled too hard on a few threads here and things are >>>>>> starting to unravel. There are numerous cases I refer to below >>>>>> where either the cast seems unnecessary/inappropriate or else >>>>>> highlights a bunch of additional changes that also need to be >>>>>> made. The fan out from this could be horrendous. Unless you >>>>>> actually get some kind of error - and I'd like to understand the >>>>>> details of those - I would not suggest making these changes as >>>>>> part of this work. >>>>> >>>>> I didn't make any change unless there was was an error.? I have 100 >>>>> failed JPRT jobs to confirm!? I eventually got a Windows system to >>>>> compile and test this on.?? Actually some of the changes came out >>>>> better.? Cases where we use jint as a bool simply turned to int. We >>>>> do not have an overload for bool for cmpxchg. 
>>>> >>>> That's unfortunate - ditto for OrderAccess. >>>> >>>>>> >>>>>> Looking through I have a quite a few queries/comments - apologies >>>>>> in advance as I know how tedious this is: >>>>>> >>>>>> make/hotspot/lib/CompileLibjsig.gmk >>>>>> src/java.base/solaris/native/libjsig/jsig.c >>>>>> >>>>>> Took a while to figure out why the include was needed. :) As a >>>>>> follow up I suggest just deleting the -I include directive, delete >>>>>> the Solaris-only definition of JSIG_VERSION_1_4_1, and delete >>>>>> everything to do with JVM_get_libjsig_version. It is all obsolete. >>>>> >>>>> Can I patch up jsig in a separate RFE?? I don't remember why this >>>>> broke so I simply moved JSIG #define.? Is jsig obsolete? Removing >>>>> JVM_* definitions generally requires a CSR. >>>> >>>> I did say "As a follow up". jsig is not obsolete but the jsig >>>> versioning code, only used by Solaris, is. >>>> >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/cpu/arm/interp_masm_arm.cpp >>>>>> >>>>>> Why did you need to add the jvm.h include? >>>>>> >>>>> >>>>> ?? tbz(Raccess_flags, JVM_ACC_SYNCHRONIZED_BIT, unlocked); >>>> >>>> Okay. I'm not going to try and figure out how this code found this >>>> before. >>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/os/windows/os_windows.cpp. >>>>>> >>>>>> The type of process_exiting should be uint to match the DWORD of >>>>>> GetCurrentThreadID. Then you should need any casts. Also you >>>>>> missed this jint cast: >>>>>> >>>>>> 3796???????? process_exiting != (jint)GetCurrentThreadId()) { >>>>> >>>>> Yes, that's better to change process_exiting to a DWORD.? It needs >>>>> a DWORD cast to 0 in the cmpxchg. >>>>> >>>>> ???????? Atomic::cmpxchg(GetCurrentThreadId(), &process_exiting, >>>>> (DWORD)0); >>>>> >>>>> These templates are picky. >>>> >>>> Yes - their inability to deal with literals is extremely frustrating. >>>> >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/c1/c1_Canonicalizer.hpp >>>>>> >>>>>> ? 43 #ifdef _WINDOWS >>>>>> ? 44?? // jint is defined as long in jni_md.h, so convert from int >>>>>> to jint >>>>>> ? 45?? void set_constant(int x)?????????????????????? { >>>>>> set_constant((jint)x); } >>>>>> ? 46 #endif >>>>>> >>>>>> Why is this necessary? int and long are the same on Windows. The >>>>>> whole point is that jint hides the underlying type, so where does >>>>>> this go wrong? >>>>> >>>>> No, they are not the same types even though they have the same >>>>> representation! >>>> >>>> This is truly unfortunate. >>>> >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>>>> >>>>>> ?ConstantIntValue((jint)0); >>>>>> >>>>>> why is this cast needed? what causes the ambiguity? (If this was a >>>>>> template I'd understand ;-) ). Also didn't you change that >>>>>> constructor to take an int anyway - not that I think it should - >>>>>> see below. >>>>> >>>>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match >>>>> 'long' better than any pointer type.? So this cast is needed. >>>> >>>> But you changed the constructor to take an int! >>>> >>>> ?class ConstantIntValue: public ScopeValue { >>>> ? private: >>>> -? jint _value; >>>> +? int _value; >>>> ? public: >>>> -? ConstantIntValue(jint value)???????? { _value = value; } >>>> +? ConstantIntValue(int value)????????? { _value = value; } >>>> >>>> >>> >>> Okay I removed this cast. >>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/ci/ciReplay.cpp >>>>>> >>>>>> 793???????? jint* dims = NEW_RESOURCE_ARRAY(jint, rank); >>>>>> >>>>>> why should this be jint? 
>>>>> >>>>> To avoid a cast from int* to jint* in the line below: >>>>> >>>>> ????????? value = kelem->multi_allocate(rank, dims, CHECK); >>>>> >>>>> >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/classfile/altHashing.cpp >>>>>> >>>>>> Okay this looks more consistent with jint. >>>>> >>>>> Yes.? I translated this from some native code iirc. >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/code/debugInfo.hpp >>>>>> >>>>>> These changes seem wrong. We have: >>>>>> >>>>>> ConstantLongValue(jlong value) >>>>>> ConstantDoubleValue(jdouble value) >>>>>> >>>>>> so we should have: >>>>>> >>>>>> ConstantIntValue(jint value) >>>>> >>>>> Again, there are multiple call sites with '0', which match int >>>>> trivially but are confused with long.? It's less consistent I agree >>>>> but better to not cast all the call sites. >>>> >>>> This is really making a mess of the APIs - they should be a jint but >>>> we declare them int because of a 0 casting problem. Can't we just >>>> use 0L? >>> >>> There aren't that many casts.? You're right, that would have been >>> better in some places. >>> >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/code/relocInfo.cpp >>>>>> >>>>>> Change seems unnecessary - int32_t is fine >>>>>> >>>>> >>>>> No, int32_t doesn't match the calls below it.? They all assume _lo >>>>> and _hi are jint. >>>>>> --- >>>>>> >>>>>> src/hotspot/share/compiler/compileBroker.cpp >>>>>> src/hotspot/share/compiler/compileBroker.hpp >>>>>> >>>>>> I see a complete mix of int and jint in this class, so why make >>>>>> the one change you did ?? >>>>> >>>>> This is another case of using jint as a flag with cmpxchg. The >>>>> templates for cmpxchg want the types to match and 0 and 1 are >>>>> essentially 'int'.? This is a lot cleaner this way. >>>> >>>> >>>> >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >>>>>> >>>>>> 1700???? tty->write((char*) start, MIN2(length, (jint)O_BUFLEN)); >>>>>> >>>>>> why did you need to add the jint cast? It's used without any cast >>>>>> on the next two lines: >>>>>> >>>>>> 1701???? length -= O_BUFLEN; >>>>>> 1702???? offset += O_BUFLEN; >>>>>> >>>>> >>>>> There's a conversion from O_BUFLEN from int to long in 1701 and >>>>> 1702.?? MIN2 is a template that wants the types to match exactly. >>>> >>>> $%^%$! templates! >>>> >>>>>> ?? >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>>>>> >>>>>> Looking around this code it seems very confused about types - eg >>>>>> the previous function is declared jboolean yet returns a jint on >>>>>> one path! It isn't clear to me if the return type is what should >>>>>> be changed or the parameter type? I would just leave this alone. >>>>> >>>>> I can't leave it alone because it doesn't compile that way. This >>>>> was the minimal change and yea, does look a bit inconsistent. >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/opto/mulnode.cpp >>>>>> >>>>>> Okay TypeInt has jint parts, so the remaining int32_t declarations >>>>>> (A, B, C, D) should also be jint. >>>>> >>>>> Yes.? c2 uses jint types. >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/opto/parse3.cpp >>>>>> >>>>>> I agree with the changes you made, but then: >>>>>> >>>>>> ?419???? jint dim_con = find_int_con(length[j], -1); >>>>>> >>>>>> should also be changed. >>>>>> >>>>>> And obviously MultiArrayExpandLimit should be defined as int not >>>>>> intx! >>>>> >>>>> Everything in globals.hpp is intx.? That's a thread that I don't >>>>> want to pull on! >>>> >>>> We still have that limitation? 
>>>>> >>>>> Changed dim_con to int. >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/opto/phaseX.cpp >>>>>> >>>>>> I can see that intcon(jint i) is consistent with longcon(jlong l), >>>>>> but the use of "i" in the code is more consistent with int than jint. >>>>> >>>>> huh?? really? >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/opto/type.cpp >>>>>> >>>>>> 1505 int TypeInt::hash(void) const { >>>>>> 1506?? return java_add(java_add(_lo, _hi), java_add((jint)_widen, >>>>>> (jint)Type::Int)); >>>>>> 1507 } >>>>>> >>>>>> I can see that the (jint) casts you added make sense, but then the >>>>>> whole function should be returning jint not int. Ditto the other >>>>>> hash functions. >>>>> >>>>> I'm not messing with this, this is the minimal in type fixing that >>>>> I'm going to do here. >>>> >>>> >>>> >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/prims/jni.cpp >>>>>> >>>>>> I think vm_created should be a bool. In fact all the fields you >>>>>> changed are logically bools - do Atomics work for bool now? >>>>> >>>>> No, they do not.?? I had thought bool would be better originally too. >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/prims/jvm.cpp >>>>>> >>>>>> is_attachable is the terminology used in the JDK code. >>>>> >>>>> Well the JDK version had is_attach_supported() as the flag name so >>>>> I used that in this one place. >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/prims/jvmtiEnvBase.cpp >>>>>> src/hotspot/share/prims/jvmtiImpl.cpp >>>>>> >>>>>> Are you making parameters consistent with the fields they initialize? >>>>> >>>>> They're consistent with the declarations now. >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/prims/jvmtiTagMap.cpp >>>>>> >>>>>> There is a mix of int and jint for slot in this code. You fixed >>>>>> some, but this remains: >>>>>> >>>>>> 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong >>>>>> thread_tag, >>>>>> 2441 jlong tid, >>>>>> 2442 jint depth, >>>>>> 2443 jmethodID method, >>>>>> 2444 jlocation bci, >>>>>> 2445 jint slot, >>>>> >>>>> Right for consistency with the declarations. >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/runtime/perfData.cpp >>>>>> >>>>>> Callers pass both jint and int, so param type seems arbitrary. >>>>> >>>>> They are, but importantly they match the declarations. >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/runtime/perfMemory.cpp >>>>>> src/hotspot/share/runtime/perfMemory.hpp >>>>>> >>>>>> PerfMemory::_initialized should ideally be a bool - can >>>>>> OrderAccess handle that now? >>>>> >>>>> Nope. >>>>>> >>>>>> --- >>>>>> >>>>>> src/java.base/share/native/include/jvm.h >>>>>> >>>>>> Not clear why the jio functions are not also JNICALL ? >>>>> >>>>> They are now.? The JDK version didn't have JNICALL.? JVM needs >>>>> JNICALL.? I can't tell you why JDK didn't need JNICALL linkage. >>>> >>>> ?? JVM currently does not have JNICALL. But they are declared as >>>> "extern C". >>> >>> This was a compilation error on Windows with JDK.?? Maybe the C code >>> in the JDK doesn't complain about linkage differences. I'll have to >>> go back and figure this out then. >>>> >>>>>> >>>>>> --- >>>>>> >>>>>> src/java.base/unix/native/include/jni_md.h >>>>>> >>>>>> There is no need to special case ARM. The differences in the >>>>>> existing code were for LTO support and that is now irrelevant. >>>>> >>>>> See discussion with Magnus.?? We still build ARM for jdk10/hs so I >>>>> needed this conditional or of course I wouldn't have added it.? We >>>>> can remove it with LTO support. 
>>>> >>>> Those builds are gone - this is obsolete. But yes all LTO can be >>>> removed later if you wish. Just trying to simplify things now. >>>> >>>>>> >>>>>> --- >>>>>> >>>>>> src/java.base/unix/native/include/jvm_md.h >>>>>> >>>>>> I know you've just copied this across, but it seems wrong to me: >>>>>> >>>>>> ?57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. >>>>>> This may >>>>>> ? 58 //?????? cause problems if JVM and the rest of JDK are built >>>>>> on different >>>>>> ? 59 //?????? Linux releases. Here we define JVM_MAXPATHLEN to be >>>>>> MAXPATHLEN + 1, >>>>>> ? 60 //?????? so buffers declared in VM are always >= 4096. >>>>>> ? 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>>> >>>>>> It doesn't make sense to me to define an internal "max path >>>>>> length" that can _exceed_ the platform max! >>>>>> >>>>>> That aside there's no support for building different parts of the >>>>>> JDK on different platforms and then bringing them together. And in >>>>>> any case I would think the real problem would be building on a >>>>>> platform that uses 4096 and running on one that uses 4095! >>>>>> >>>>>> But that aside this is a Linux hack and should be guarded by ifdef >>>>>> LINUX. (I doubt BSD needs it, the bsd file is just a copy of the >>>>>> linux one - the JDK macosx version does the right thing). Solaris >>>>>> and AIX should stay as-is at MAXPATHLEN. >>>>> >>>>> All of the unix platforms had MAXPATHLEN+1.? I'll leave it for now >>>>> and we can investigate that further. >>>> >>>> I see the following existing code: >>>> >>>> src/java.base/unix/native/include/jvm_md.h: >>>> >>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>> >>>> src/java.base/macosx/native/include/jvm_md.h >>>> >>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>> >>>> src/hotspot/os/aix/jvm_aix.h >>>> >>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>> >>>> src/hotspot/os/bsd/jvm_bsd.h >>>> >>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1? // blindly copied from Linux >>>> version >>>> >>>> src/hotspot/os/linux/jvm_linux.h >>>> >>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>> >>>> src/hotspot/os/solaris/jvm_solaris.h >>>> >>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>> >>>> This is a linux only hack (if you ignore the blind copy from linux >>>> into the BSD code in the VM). >>> >>> Oh, thanks, so should I add a bunch of ifdefs then?? Or do you think >>> having MAXPATHLEN + 1 will really break the other platforms?? Do you >>> really see this as a problem or are you just pointing out inconsistency? >>>> >>>>>> >>>>>> ?86 #define ASYNC_SIGNAL???? SIGJVM2 >>>>>> >>>>>> This only exists on Solaris so I think should be in #ifdef >>>>>> SOLARIS, to make that clear. >>>>> >>>>> Ok.? I'll add this. >>>>>> >>>>>> --- >>>>>> >>>>>> src/java.base/windows/native/include/jvm_md.h >>>>>> >>>>>> Given the differences between the two versions either something >>>>>> has been broken or "extern C" declarations are not needed :) >>>>> >>>>> Well, they are needed for Hotspot to build and do not prevent jdk >>>>> from building.? I don't know what was broken. >>>> >>>> We really need to understand this better. Maybe related to the map >>>> files that expose the symbols. ?? >>> >>> They're needed because the JDK files are written mostly in C and that >>> doesn't complain about the linkage difference.? Hotspot files are in >>> C++ which does complain. >>> >>>> >>>>>> >>>>>> --- >>>>>> >>>>>> That was a really painful way to spend most of my Friday. TGIF! :) >>>>> >>>>> Thanks for going through it.? See comments inline for changes. 
>>>>> Generating a webrev takes hours so I'm not going to do that unless >>>>> you insist. >>>> >>>> An incremental webrev shouldn't take long - right? You're a mq >>>> maestro now. :) >>> >>> Well I generally trash a repository whenever I use mq but sure. >>>> >>>> If you can reasonably produce an incremental webrev once you've >>>> settled on all the comments/issues that would be good. >>> >>> Ok, sure. >>> >>> Coleen >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>> >>>>>> On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: >>>>>>> ??Hi Magnus, >>>>>>> >>>>>>> Thank you for reviewing this.?? I have a new version that takes >>>>>>> out the hack in globalDefinitions.hpp and adds casts to >>>>>>> src/hotspot/share/opto/type.cpp instead. >>>>>>> >>>>>>> Also some fixes from Martin at SAP. >>>>>>> >>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >>>>>>> >>>>>>> see below. >>>>>>> >>>>>>> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>>>>>>> Coleen, >>>>>>>> >>>>>>>> Thank you for addressing this! >>>>>>>> >>>>>>>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>>>> >>>>>>>>> Mostly used sed to remove prims/jvm.h and move #include "jvm.h" >>>>>>>>> after precompiled.h, so if you have repetitive stress wrist >>>>>>>>> issues don't click on most of these files. >>>>>>>>> >>>>>>>>> There were more issues to resolve, however.? The JDK windows >>>>>>>>> jni_md.h file defined jint as long and the hotspot windows >>>>>>>>> jni_x86.h as int. I had to choose the jdk version since it's >>>>>>>>> the public version, so there are changes to the hotspot files >>>>>>>>> for this. Generally I changed the code to use 'int' rather than >>>>>>>>> 'jint' where the surrounding API didn't insist on consistently >>>>>>>>> using java types. We should mostly be using C++ types within >>>>>>>>> hotspot except in interfaces to native/JNI code.? There are a >>>>>>>>> couple of hacks in places where adding multiple jint casts was >>>>>>>>> too painful. >>>>>>>>> >>>>>>>>> Tested with JPRT and tier2-4 (in progress). >>>>>>>>> >>>>>>>>> open webrev at >>>>>>>>> http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>>>>>>> >>>>>>>> Looks great! >>>>>>>> >>>>>>>> Just a few comments: >>>>>>>> >>>>>>>> * src/java.base/unix/native/include/jni_md.h: >>>>>>>> >>>>>>>> I don't think the externally_visible attribute should be there >>>>>>>> for arm. I know this was the case for the corresponding hotspot >>>>>>>> file for arm, but that was techically incorrect. The proper >>>>>>>> dependency here is that externally_visible should be in all >>>>>>>> JNIEXPORT if and only if we're building with JVM feature >>>>>>>> "link-time-opt". Traditionally, that feature been enabled when >>>>>>>> building arm32 builds, and only then, so there's been a >>>>>>>> (coincidentally) connection here. Nowadays, Oracle does not care >>>>>>>> about the arm32 builds, and I'm not sure if anyone else is >>>>>>>> building them with link-time-opt enabled. >>>>>>>> >>>>>>>> It does seem wrong to me to export this behavior in the public >>>>>>>> jni_md.h file, though. I think the correct way to solve this, if >>>>>>>> we should continue supporting link-time-opt is to make sure this >>>>>>>> attribute is set for exported hotspot functions. If it's still >>>>>>>> needed, that is. 
A quick googling seems to indicate that >>>>>>>> visibility("default") might be enough in modern gcc's. >>>>>>>> >>>>>>>> A third option is to remove the support for link-time-opt >>>>>>>> entirely, if it's not really used. >>>>>>> >>>>>>> I didn't know how to change this since we are still building ARM >>>>>>> with the jdk10/hs repository, and ARM needed this change.? I >>>>>>> could wait until we bring down the jdk10/master changes that >>>>>>> remove the ARM build and remove this conditional before I push. >>>>>>> Or we could file an RFE to remove link-time-opt (?) and remove it >>>>>>> then? >>>>>>> >>>>>>>> >>>>>>>> * src/java.base/unix/native/include/jvm_md.h and >>>>>>>> src/java.base/windows/native/include/jvm_md.h: >>>>>>>> >>>>>>>> These files define a public API, and contain non-trivial >>>>>>>> changes. I suspect you should file a CSR request. (Even though I >>>>>>>> realize you're only matching the header file with the reality.) >>>>>>>> >>>>>>> >>>>>>> I filed the CSR.?? Waiting for the next steps. >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>>>> /Magnus >>>>>>>> >>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>>>>>>> >>>>>>>>> I have a script to update copyright files on commit. >>>>>>>>> >>>>>>>>> Thanks to Magnus and ErikJ for the makefile changes. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>> > From coleen.phillimore at oracle.com Mon Oct 30 12:38:23 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Oct 2017 08:38:23 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <815ac734-ea8b-ea2d-ecec-85cb547ba2f4@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> <22afedef-59cc-ecde-48fc-0afb7b4bbb47@oracle.com> <815ac734-ea8b-ea2d-ecec-85cb547ba2f4@oracle.com> Message-ID: <440f79ba-2da3-b627-53bc-e1842e3cf73c@oracle.com> On 10/30/17 8:17 AM, David Holmes wrote: > On 30/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >> On 10/28/17 3:50 AM, David Holmes wrote: >>> Hi Coleen, >>> >>> I've commented on the file location in response to Mandy's email. >>> >>> The only issue I'm still concerned about is the JVM_MAXPATHLEN >>> issue. I think it is a bug to define a JVM_MAXPATHLEN that is bigger >>> than the platform MAXPATHLEN. I also would not want to see any >>> change in behaviour because of this - so AIX and Solaris should not >>> get a different JVM_MAXPATHLEN due to this refactoring change. So >>> yes I think this needs to be ifdef'd for Linux and reluctantly >>> (because it was a copy error) for OSX/BSD as well. >> >> #if defined(AIX) || defined(SOLARIS) >> #define JVM_MAXPATHLEN MAXPATHLEN >> #else >> // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. This may >> //?????? cause problems if JVM and the rest of JDK are built on >> different >> //?????? Linux releases. Here we define JVM_MAXPATHLEN to be >> MAXPATHLEN + 1, >> //?????? so buffers declared in VM are always >= 4096. >> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >> #endif >> >> Is this ok? > > Yes - thanks. It preserves existing behaviour on the VM side at least. > Time will tell if it messes anything up on the JDK side for Linux/OSX. 
I don't want to wait for time so I'm investigating. It's one use is: Java_java_io_UnixFileSystem_canonicalize0(JNIEnv *env, jobject this, ... ??????? char canonicalPath[JVM_MAXPATHLEN]; ??????? if (canonicalize((char *)path, ???????????????????????? canonicalPath, JVM_MAXPATHLEN) < 0) { ??????????? JNU_ThrowIOExceptionWithLastError(env, "Bad pathname"); Which goes to: canonicalize_md.c canonicalize(char *original, char *resolved, int len) ??? if (len < PATH_MAX) { ??????? errno = EINVAL; ??????? return -1; ??? } So this should fail every time. sys/param.h:# define MAXPATHLEN??? PATH_MAX I haven't found any tests for it. I don't know why Java_java_io_UnixFileSystem uses JVM_MAXPATHLEN since it's not calling the JVM interface as far as I can tell.??? I think it should be changed to PATH_MAX. ? Coleen > > David > >> thanks, >> Coleen >>> >>> Thanks, >>> David >>> >>> On 28/10/2017 12:08 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 10/27/17 9:37 AM, David Holmes wrote: >>>>> On 27/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> >>>>>> On 10/27/17 3:23 AM, David Holmes wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> Thanks for tackling this. >>>>>>> >>>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>> >>>>>>> Can you update the bug synopsis to show it covers both sets of >>>>>>> files please. >>>>>>> >>>>>>> I hate to start with this (and it took me quite a while to >>>>>>> realize it) but as Mandy pointed out jvm.h is not an exported >>>>>>> interface from the JDK to the outside world (so not subject to >>>>>>> CSR review), but is a private interface between the JVM and the >>>>>>> JDK libraries. So I think really jvm.h belongs in the hotspot >>>>>>> sources where it was, while jni.h belongs in the exported JDK >>>>>>> sources. In which case the bulk of your changes to the hotspot >>>>>>> files would not be needed - sorry. >>>>>> >>>>>> Maybe someone can make that decision and change at a later date. >>>>>> The point of this change is that there is now only one of these >>>>>> files that is shared.? I don't think jvm.h and the jvm_md.h >>>>>> belong on the hotspot sources for the jdk to find them in some >>>>>> random prims and os dependent directories. >>>>> >>>>> The one file that is needed is a hotspot file - jvm.h defines the >>>>> interface that hotspot exports via jvm.cpp. >>>>> >>>>> If you leave jvm.h in hotspot/prims then a very large chunk of >>>>> your boilerplate changes are not needed. The JDK code doesn't care >>>>> what the name of the directory is - whatever it is just gets added >>>>> as a -I directive (the JDK code will include "jvm.h" not >>>>> "prims/jvm.h" the way hotspot sources do. >>>>> >>>>> This isn't something we want to change back or move again later. >>>>> Whatever we do now we live with. >>>> >>>> I think it belongs with jni.h and I think the core libraries group >>>> would agree.?? It seems more natural there than buried in the >>>> hotspot prims directory.? I guess this is on hold while we have >>>> this debate.?? Sigh. >>>> >>>> Actually with -I directives, changing to jvm.h from prims/jvm.h >>>> would still work.?? Maybe we should change the name to jvm.hpp >>>> since it's jvm.cpp though??? Or maybe just have two divergent >>>> copies and close this as WNF. >>>> >>>>> >>>>>> I'm happy to withdraw the CSR.? We generally use the CSR process >>>>>> to add and remove JVM_ interfaces even though they're a private >>>>>> interface in case some other JVM/JDK combination relies on them. 
>>>>>> The changes to these files are very minor though and not likely >>>>>> to cause any even theoretical incompatibility, so I'll withdraw it. >>>>>>> >>>>>>> Moving on ... >>>>>>> >>>>>>> First to address the initial comments/query you had: >>>>>>> >>>>>>>> The JDK windows jni_md.h file defined jint as long and the hotspot >>>>>>>> windows jni_x86.h as int. I had to choose the jdk version since >>>>>>>> it's the >>>>>>>> public version, so there are changes to the hotspot files for >>>>>>>> this. >>>>>>> >>>>>>> On Windows int and long are always the same as it uses ILP32 or >>>>>>> LLP64 (not LP64 like *nix platforms). So either choice should be >>>>>>> fine. That said there are some odd casting issues I comment on >>>>>>> below. Does the VS compiler complain about mixing int and long >>>>>>> in expressions? >>>>>> >>>>>> Yes, it does even though int and long are the same representation. >>>>> >>>>> And what an absolute mess that makes. :( >>>>> >>>>>>> >>>>>>>> Generally I changed the code to use 'int' rather than 'jint' >>>>>>>> where the >>>>>>>> surrounding API didn't insist on consistently using java types. We >>>>>>>> should mostly be using C++ types within hotspot except in >>>>>>>> interfaces to >>>>>>>> native/JNI code. >>>>>>> >>>>>>> I think you pulled too hard on a few threads here and things are >>>>>>> starting to unravel. There are numerous cases I refer to below >>>>>>> where either the cast seems unnecessary/inappropriate or else >>>>>>> highlights a bunch of additional changes that also need to be >>>>>>> made. The fan out from this could be horrendous. Unless you >>>>>>> actually get some kind of error - and I'd like to understand the >>>>>>> details of those - I would not suggest making these changes as >>>>>>> part of this work. >>>>>> >>>>>> I didn't make any change unless there was was an error. I have >>>>>> 100 failed JPRT jobs to confirm!? I eventually got a Windows >>>>>> system to compile and test this on. Actually some of the changes >>>>>> came out better.? Cases where we use jint as a bool simply turned >>>>>> to int. We do not have an overload for bool for cmpxchg. >>>>> >>>>> That's unfortunate - ditto for OrderAccess. >>>>> >>>>>>> >>>>>>> Looking through I have a quite a few queries/comments - >>>>>>> apologies in advance as I know how tedious this is: >>>>>>> >>>>>>> make/hotspot/lib/CompileLibjsig.gmk >>>>>>> src/java.base/solaris/native/libjsig/jsig.c >>>>>>> >>>>>>> Took a while to figure out why the include was needed. :) As a >>>>>>> follow up I suggest just deleting the -I include directive, >>>>>>> delete the Solaris-only definition of JSIG_VERSION_1_4_1, and >>>>>>> delete everything to do with JVM_get_libjsig_version. It is all >>>>>>> obsolete. >>>>>> >>>>>> Can I patch up jsig in a separate RFE?? I don't remember why this >>>>>> broke so I simply moved JSIG #define.? Is jsig obsolete? Removing >>>>>> JVM_* definitions generally requires a CSR. >>>>> >>>>> I did say "As a follow up". jsig is not obsolete but the jsig >>>>> versioning code, only used by Solaris, is. >>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/cpu/arm/interp_masm_arm.cpp >>>>>>> >>>>>>> Why did you need to add the jvm.h include? >>>>>>> >>>>>> >>>>>> ?? tbz(Raccess_flags, JVM_ACC_SYNCHRONIZED_BIT, unlocked); >>>>> >>>>> Okay. I'm not going to try and figure out how this code found this >>>>> before. >>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/os/windows/os_windows.cpp. 
>>>>>>> >>>>>>> The type of process_exiting should be uint to match the DWORD of >>>>>>> GetCurrentThreadID. Then you should need any casts. Also you >>>>>>> missed this jint cast: >>>>>>> >>>>>>> 3796???????? process_exiting != (jint)GetCurrentThreadId()) { >>>>>> >>>>>> Yes, that's better to change process_exiting to a DWORD.? It >>>>>> needs a DWORD cast to 0 in the cmpxchg. >>>>>> >>>>>> ???????? Atomic::cmpxchg(GetCurrentThreadId(), &process_exiting, >>>>>> (DWORD)0); >>>>>> >>>>>> These templates are picky. >>>>> >>>>> Yes - their inability to deal with literals is extremely frustrating. >>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/c1/c1_Canonicalizer.hpp >>>>>>> >>>>>>> ? 43 #ifdef _WINDOWS >>>>>>> ? 44?? // jint is defined as long in jni_md.h, so convert from >>>>>>> int to jint >>>>>>> ? 45?? void set_constant(int x) { set_constant((jint)x); } >>>>>>> ? 46 #endif >>>>>>> >>>>>>> Why is this necessary? int and long are the same on Windows. The >>>>>>> whole point is that jint hides the underlying type, so where >>>>>>> does this go wrong? >>>>>> >>>>>> No, they are not the same types even though they have the same >>>>>> representation! >>>>> >>>>> This is truly unfortunate. >>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>>>>> >>>>>>> ?ConstantIntValue((jint)0); >>>>>>> >>>>>>> why is this cast needed? what causes the ambiguity? (If this was >>>>>>> a template I'd understand ;-) ). Also didn't you change that >>>>>>> constructor to take an int anyway - not that I think it should - >>>>>>> see below. >>>>>> >>>>>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't >>>>>> match 'long' better than any pointer type.? So this cast is needed. >>>>> >>>>> But you changed the constructor to take an int! >>>>> >>>>> ?class ConstantIntValue: public ScopeValue { >>>>> ? private: >>>>> -? jint _value; >>>>> +? int _value; >>>>> ? public: >>>>> -? ConstantIntValue(jint value)???????? { _value = value; } >>>>> +? ConstantIntValue(int value)????????? { _value = value; } >>>>> >>>>> >>>> >>>> Okay I removed this cast. >>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/ci/ciReplay.cpp >>>>>>> >>>>>>> 793???????? jint* dims = NEW_RESOURCE_ARRAY(jint, rank); >>>>>>> >>>>>>> why should this be jint? >>>>>> >>>>>> To avoid a cast from int* to jint* in the line below: >>>>>> >>>>>> ????????? value = kelem->multi_allocate(rank, dims, CHECK); >>>>>> >>>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/classfile/altHashing.cpp >>>>>>> >>>>>>> Okay this looks more consistent with jint. >>>>>> >>>>>> Yes.? I translated this from some native code iirc. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/code/debugInfo.hpp >>>>>>> >>>>>>> These changes seem wrong. We have: >>>>>>> >>>>>>> ConstantLongValue(jlong value) >>>>>>> ConstantDoubleValue(jdouble value) >>>>>>> >>>>>>> so we should have: >>>>>>> >>>>>>> ConstantIntValue(jint value) >>>>>> >>>>>> Again, there are multiple call sites with '0', which match int >>>>>> trivially but are confused with long.? It's less consistent I >>>>>> agree but better to not cast all the call sites. >>>>> >>>>> This is really making a mess of the APIs - they should be a jint >>>>> but we declare them int because of a 0 casting problem. Can't we >>>>> just use 0L? >>>> >>>> There aren't that many casts.? You're right, that would have been >>>> better in some places. 
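
(The recurring 0-versus-0L problem reduces to a few lines of overload resolution. The emit_* overloads below are hypothetical; they only mirror a jint parameter, long on Windows and int elsewhere, competing with an overload that takes a pointer.)

    struct ScopeValueLike {};            // stands in for whatever pointer overloads exist

    void emit_int(int)              {}   // "jint" parameter on unix
    void emit_int(ScopeValueLike*)  {}
    void emit_long(long)            {}   // "jint" parameter on Windows
    void emit_long(ScopeValueLike*) {}

    void calls() {
      emit_int(0);              // fine: 0 is an int, exact match beats the pointer overload
      // emit_long(0);          // ambiguous: int -> long and 0 -> null pointer are both
                                // plain conversions, so neither overload wins; this is the
                                // error behind the (jint)0 casts once jint means long
      emit_long((long)0);       // the cast, spelled (jint)0 in the patch, resolves it
      emit_long(0L);            // 0L also works here ...
      // emit_int(0L);          // ... but is ambiguous where jint is int, for the same
                                // reason, which is the 0L objection raised above
    }

This also answers the puzzlement earlier in the thread about the cast being needed for long but not for int: with int the literal 0 is an exact match, so only the Windows (jint == long) case can become ambiguous with a pointer overload.
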
>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/code/relocInfo.cpp >>>>>>> >>>>>>> Change seems unnecessary - int32_t is fine >>>>>>> >>>>>> >>>>>> No, int32_t doesn't match the calls below it.? They all assume >>>>>> _lo and _hi are jint. >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/compiler/compileBroker.cpp >>>>>>> src/hotspot/share/compiler/compileBroker.hpp >>>>>>> >>>>>>> I see a complete mix of int and jint in this class, so why make >>>>>>> the one change you did ?? >>>>>> >>>>>> This is another case of using jint as a flag with cmpxchg. The >>>>>> templates for cmpxchg want the types to match and 0 and 1 are >>>>>> essentially 'int'.? This is a lot cleaner this way. >>>>> >>>>> >>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >>>>>>> >>>>>>> 1700???? tty->write((char*) start, MIN2(length, (jint)O_BUFLEN)); >>>>>>> >>>>>>> why did you need to add the jint cast? It's used without any >>>>>>> cast on the next two lines: >>>>>>> >>>>>>> 1701???? length -= O_BUFLEN; >>>>>>> 1702???? offset += O_BUFLEN; >>>>>>> >>>>>> >>>>>> There's a conversion from O_BUFLEN from int to long in 1701 and >>>>>> 1702.?? MIN2 is a template that wants the types to match exactly. >>>>> >>>>> $%^%$! templates! >>>>> >>>>>>> ?? >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>>>>>> >>>>>>> Looking around this code it seems very confused about types - eg >>>>>>> the previous function is declared jboolean yet returns a jint on >>>>>>> one path! It isn't clear to me if the return type is what should >>>>>>> be changed or the parameter type? I would just leave this alone. >>>>>> >>>>>> I can't leave it alone because it doesn't compile that way. This >>>>>> was the minimal change and yea, does look a bit inconsistent. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/opto/mulnode.cpp >>>>>>> >>>>>>> Okay TypeInt has jint parts, so the remaining int32_t >>>>>>> declarations (A, B, C, D) should also be jint. >>>>>> >>>>>> Yes.? c2 uses jint types. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/opto/parse3.cpp >>>>>>> >>>>>>> I agree with the changes you made, but then: >>>>>>> >>>>>>> ?419???? jint dim_con = find_int_con(length[j], -1); >>>>>>> >>>>>>> should also be changed. >>>>>>> >>>>>>> And obviously MultiArrayExpandLimit should be defined as int not >>>>>>> intx! >>>>>> >>>>>> Everything in globals.hpp is intx.? That's a thread that I don't >>>>>> want to pull on! >>>>> >>>>> We still have that limitation? >>>>>> >>>>>> Changed dim_con to int. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/opto/phaseX.cpp >>>>>>> >>>>>>> I can see that intcon(jint i) is consistent with longcon(jlong >>>>>>> l), but the use of "i" in the code is more consistent with int >>>>>>> than jint. >>>>>> >>>>>> huh?? really? >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/opto/type.cpp >>>>>>> >>>>>>> 1505 int TypeInt::hash(void) const { >>>>>>> 1506?? return java_add(java_add(_lo, _hi), >>>>>>> java_add((jint)_widen, (jint)Type::Int)); >>>>>>> 1507 } >>>>>>> >>>>>>> I can see that the (jint) casts you added make sense, but then >>>>>>> the whole function should be returning jint not int. Ditto the >>>>>>> other hash functions. >>>>>> >>>>>> I'm not messing with this, this is the minimal in type fixing >>>>>> that I'm going to do here. >>>>> >>>>> >>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/prims/jni.cpp >>>>>>> >>>>>>> I think vm_created should be a bool. 
In fact all the fields you >>>>>>> changed are logically bools - do Atomics work for bool now? >>>>>> >>>>>> No, they do not.?? I had thought bool would be better originally >>>>>> too. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/prims/jvm.cpp >>>>>>> >>>>>>> is_attachable is the terminology used in the JDK code. >>>>>> >>>>>> Well the JDK version had is_attach_supported() as the flag name >>>>>> so I used that in this one place. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiEnvBase.cpp >>>>>>> src/hotspot/share/prims/jvmtiImpl.cpp >>>>>>> >>>>>>> Are you making parameters consistent with the fields they >>>>>>> initialize? >>>>>> >>>>>> They're consistent with the declarations now. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiTagMap.cpp >>>>>>> >>>>>>> There is a mix of int and jint for slot in this code. You fixed >>>>>>> some, but this remains: >>>>>>> >>>>>>> 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong >>>>>>> thread_tag, >>>>>>> 2441 jlong tid, >>>>>>> 2442 jint depth, >>>>>>> 2443 jmethodID method, >>>>>>> 2444 jlocation bci, >>>>>>> 2445 jint slot, >>>>>> >>>>>> Right for consistency with the declarations. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/runtime/perfData.cpp >>>>>>> >>>>>>> Callers pass both jint and int, so param type seems arbitrary. >>>>>> >>>>>> They are, but importantly they match the declarations. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/runtime/perfMemory.cpp >>>>>>> src/hotspot/share/runtime/perfMemory.hpp >>>>>>> >>>>>>> PerfMemory::_initialized should ideally be a bool - can >>>>>>> OrderAccess handle that now? >>>>>> >>>>>> Nope. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/java.base/share/native/include/jvm.h >>>>>>> >>>>>>> Not clear why the jio functions are not also JNICALL ? >>>>>> >>>>>> They are now.? The JDK version didn't have JNICALL.? JVM needs >>>>>> JNICALL.? I can't tell you why JDK didn't need JNICALL linkage. >>>>> >>>>> ?? JVM currently does not have JNICALL. But they are declared as >>>>> "extern C". >>>> >>>> This was a compilation error on Windows with JDK.?? Maybe the C >>>> code in the JDK doesn't complain about linkage differences. I'll >>>> have to go back and figure this out then. >>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/java.base/unix/native/include/jni_md.h >>>>>>> >>>>>>> There is no need to special case ARM. The differences in the >>>>>>> existing code were for LTO support and that is now irrelevant. >>>>>> >>>>>> See discussion with Magnus.?? We still build ARM for jdk10/hs so >>>>>> I needed this conditional or of course I wouldn't have added it.? >>>>>> We can remove it with LTO support. >>>>> >>>>> Those builds are gone - this is obsolete. But yes all LTO can be >>>>> removed later if you wish. Just trying to simplify things now. >>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/java.base/unix/native/include/jvm_md.h >>>>>>> >>>>>>> I know you've just copied this across, but it seems wrong to me: >>>>>>> >>>>>>> ?57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on >>>>>>> others. This may >>>>>>> ? 58 //?????? cause problems if JVM and the rest of JDK are >>>>>>> built on different >>>>>>> ? 59 //?????? Linux releases. Here we define JVM_MAXPATHLEN to >>>>>>> be MAXPATHLEN + 1, >>>>>>> ? 60 //?????? so buffers declared in VM are always >= 4096. >>>>>>> ? 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>>>> >>>>>>> It doesn't make sense to me to define an internal "max path >>>>>>> length" that can _exceed_ the platform max! 
>>>>>>> >>>>>>> That aside there's no support for building different parts of >>>>>>> the JDK on different platforms and then bringing them together. >>>>>>> And in any case I would think the real problem would be building >>>>>>> on a platform that uses 4096 and running on one that uses 4095! >>>>>>> >>>>>>> But that aside this is a Linux hack and should be guarded by >>>>>>> ifdef LINUX. (I doubt BSD needs it, the bsd file is just a copy >>>>>>> of the linux one - the JDK macosx version does the right thing). >>>>>>> Solaris and AIX should stay as-is at MAXPATHLEN. >>>>>> >>>>>> All of the unix platforms had MAXPATHLEN+1.? I'll leave it for >>>>>> now and we can investigate that further. >>>>> >>>>> I see the following existing code: >>>>> >>>>> src/java.base/unix/native/include/jvm_md.h: >>>>> >>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>> >>>>> src/java.base/macosx/native/include/jvm_md.h >>>>> >>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>> >>>>> src/hotspot/os/aix/jvm_aix.h >>>>> >>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>> >>>>> src/hotspot/os/bsd/jvm_bsd.h >>>>> >>>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1? // blindly copied from >>>>> Linux version >>>>> >>>>> src/hotspot/os/linux/jvm_linux.h >>>>> >>>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>> >>>>> src/hotspot/os/solaris/jvm_solaris.h >>>>> >>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>> >>>>> This is a linux only hack (if you ignore the blind copy from linux >>>>> into the BSD code in the VM). >>>> >>>> Oh, thanks, so should I add a bunch of ifdefs then?? Or do you >>>> think having MAXPATHLEN + 1 will really break the other platforms?? >>>> Do you really see this as a problem or are you just pointing out >>>> inconsistency? >>>>> >>>>>>> >>>>>>> ?86 #define ASYNC_SIGNAL???? SIGJVM2 >>>>>>> >>>>>>> This only exists on Solaris so I think should be in #ifdef >>>>>>> SOLARIS, to make that clear. >>>>>> >>>>>> Ok.? I'll add this. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/java.base/windows/native/include/jvm_md.h >>>>>>> >>>>>>> Given the differences between the two versions either something >>>>>>> has been broken or "extern C" declarations are not needed :) >>>>>> >>>>>> Well, they are needed for Hotspot to build and do not prevent jdk >>>>>> from building.? I don't know what was broken. >>>>> >>>>> We really need to understand this better. Maybe related to the map >>>>> files that expose the symbols. ?? >>>> >>>> They're needed because the JDK files are written mostly in C and >>>> that doesn't complain about the linkage difference. Hotspot files >>>> are in C++ which does complain. >>>> >>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> That was a really painful way to spend most of my Friday. TGIF! :) >>>>>> >>>>>> Thanks for going through it.? See comments inline for changes. >>>>>> Generating a webrev takes hours so I'm not going to do that >>>>>> unless you insist. >>>>> >>>>> An incremental webrev shouldn't take long - right? You're a mq >>>>> maestro now. :) >>>> >>>> Well I generally trash a repository whenever I use mq but sure. >>>>> >>>>> If you can reasonably produce an incremental webrev once you've >>>>> settled on all the comments/issues that would be good. >>>> >>>> Ok, sure. >>>> >>>> Coleen >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>> >>>>>>> On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: >>>>>>>> ??Hi Magnus, >>>>>>>> >>>>>>>> Thank you for reviewing this.?? 
I have a new version that takes >>>>>>>> out the hack in globalDefinitions.hpp and adds casts to >>>>>>>> src/hotspot/share/opto/type.cpp instead. >>>>>>>> >>>>>>>> Also some fixes from Martin at SAP. >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >>>>>>>> >>>>>>>> see below. >>>>>>>> >>>>>>>> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>>>>>>>> Coleen, >>>>>>>>> >>>>>>>>> Thank you for addressing this! >>>>>>>>> >>>>>>>>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>>>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>>>>> >>>>>>>>>> Mostly used sed to remove prims/jvm.h and move #include >>>>>>>>>> "jvm.h" after precompiled.h, so if you have repetitive stress >>>>>>>>>> wrist issues don't click on most of these files. >>>>>>>>>> >>>>>>>>>> There were more issues to resolve, however.? The JDK windows >>>>>>>>>> jni_md.h file defined jint as long and the hotspot windows >>>>>>>>>> jni_x86.h as int. I had to choose the jdk version since it's >>>>>>>>>> the public version, so there are changes to the hotspot files >>>>>>>>>> for this. Generally I changed the code to use 'int' rather >>>>>>>>>> than 'jint' where the surrounding API didn't insist on >>>>>>>>>> consistently using java types. We should mostly be using C++ >>>>>>>>>> types within hotspot except in interfaces to native/JNI >>>>>>>>>> code.? There are a couple of hacks in places where adding >>>>>>>>>> multiple jint casts was too painful. >>>>>>>>>> >>>>>>>>>> Tested with JPRT and tier2-4 (in progress). >>>>>>>>>> >>>>>>>>>> open webrev at >>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>>>>>>>> >>>>>>>>> Looks great! >>>>>>>>> >>>>>>>>> Just a few comments: >>>>>>>>> >>>>>>>>> * src/java.base/unix/native/include/jni_md.h: >>>>>>>>> >>>>>>>>> I don't think the externally_visible attribute should be there >>>>>>>>> for arm. I know this was the case for the corresponding >>>>>>>>> hotspot file for arm, but that was techically incorrect. The >>>>>>>>> proper dependency here is that externally_visible should be in >>>>>>>>> all JNIEXPORT if and only if we're building with JVM feature >>>>>>>>> "link-time-opt". Traditionally, that feature been enabled when >>>>>>>>> building arm32 builds, and only then, so there's been a >>>>>>>>> (coincidentally) connection here. Nowadays, Oracle does not >>>>>>>>> care about the arm32 builds, and I'm not sure if anyone else >>>>>>>>> is building them with link-time-opt enabled. >>>>>>>>> >>>>>>>>> It does seem wrong to me to export this behavior in the public >>>>>>>>> jni_md.h file, though. I think the correct way to solve this, >>>>>>>>> if we should continue supporting link-time-opt is to make sure >>>>>>>>> this attribute is set for exported hotspot functions. If it's >>>>>>>>> still needed, that is. A quick googling seems to indicate that >>>>>>>>> visibility("default") might be enough in modern gcc's. >>>>>>>>> >>>>>>>>> A third option is to remove the support for link-time-opt >>>>>>>>> entirely, if it's not really used. >>>>>>>> >>>>>>>> I didn't know how to change this since we are still building >>>>>>>> ARM with the jdk10/hs repository, and ARM needed this change.? >>>>>>>> I could wait until we bring down the jdk10/master changes that >>>>>>>> remove the ARM build and remove this conditional before I push. >>>>>>>> Or we could file an RFE to remove link-time-opt (?) and remove >>>>>>>> it then? 
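For readers following the link-time-opt point above, a rough sketch of the two JNIEXPORT variants in question (hypothetical guard name; the real jni_md.h is organized differently):

  // JVM_LINK_TIME_OPT is a made-up guard standing in for "built with the
  // link-time-opt JVM feature"; it is not an existing macro.
  #if defined(__GNUC__) && defined(JVM_LINK_TIME_OPT)
    // externally_visible keeps the symbol alive under whole-program optimization.
    #define JNIEXPORT __attribute__((visibility("default"), externally_visible))
  #elif defined(__GNUC__)
    // Plain default visibility, which may be enough on modern gcc as noted above.
    #define JNIEXPORT __attribute__((visibility("default")))
  #else
    #define JNIEXPORT
  #endif

  extern "C" JNIEXPORT int jni_example_export(void) { return 42; }  // hypothetical exported function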
>>>>>>>> >>>>>>>>> >>>>>>>>> * src/java.base/unix/native/include/jvm_md.h and >>>>>>>>> src/java.base/windows/native/include/jvm_md.h: >>>>>>>>> >>>>>>>>> These files define a public API, and contain non-trivial >>>>>>>>> changes. I suspect you should file a CSR request. (Even though >>>>>>>>> I realize you're only matching the header file with the reality.) >>>>>>>>> >>>>>>>> >>>>>>>> I filed the CSR.?? Waiting for the next steps. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>>> /Magnus >>>>>>>>> >>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>>>>>>>> >>>>>>>>>> I have a script to update copyright files on commit. >>>>>>>>>> >>>>>>>>>> Thanks to Magnus and ErikJ for the makefile changes. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>> >> From Alan.Bateman at oracle.com Mon Oct 30 13:24:33 2017 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 30 Oct 2017 13:24:33 +0000 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <440f79ba-2da3-b627-53bc-e1842e3cf73c@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> <22afedef-59cc-ecde-48fc-0afb7b4bbb47@oracle.com> <815ac734-ea8b-ea2d-ecec-85cb547ba2f4@oracle.com> <440f79ba-2da3-b627-53bc-e1842e3cf73c@oracle.com> Message-ID: <9fa3a074-3ebc-4fb5-4ffa-72d8bc4e5dc2@oracle.com> On 30/10/2017 12:38, coleen.phillimore at oracle.com wrote: > : > > > I don't know why Java_java_io_UnixFileSystem uses JVM_MAXPATHLEN since > it's not calling the JVM interface as far as I can tell. I think it > should be changed to PATH_MAX. This code used to use the JVM_* functions (dates back to early JDK releases). The JVM_MAXPATHLEN usage is likely left over from when this code was change to use the syscalls directly. -Alan From robbin.ehn at oracle.com Mon Oct 30 14:34:29 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 30 Oct 2017 15:34:29 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <4ebb905f23324a00b9cf10d8d410d420@sap.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> <59F2DC24.8050701@oracle.com> <59F2F01A.403@oracle.com> <4ebb905f23324a00b9cf10d8d410d420@sap.com> Message-ID: Thanks! There have been a bit hesitation and confusion about the option (at least internally). The option is opt-out but in globals.hpp it starts out as false. 
Now instead we explicit set it true in globals.hpp but we turn it off if we notice that: - We are on an unsupported platform - User have specified UseAOT - User have specified EnableJVMCI Here is webrev for changes needed: http://cr.openjdk.java.net/~rehn/8185640/v8/Option-Cleanup-12/webrev/ And here is CSR: https://bugs.openjdk.java.net/browse/JDK-8189942 Manual testing + basic testing done. And since I'm really hoping that this can be the last incremental, here is my whole patch queue flatten out: http://cr.openjdk.java.net/~rehn/8185640/v8/Full/webrev/ Thanks, Robbin On 10/27/2017 04:47 PM, Doerr, Martin wrote: > Hi Robbin, > > excellent. I think this matches what Coleen had proposed, now. > Thanks for doing all the work with so many incremental patches and for responding on so many discussions. Seems to be a tough piece of work. > > Best regards, > Martin > > > -----Original Message----- > From: Robbin Ehn [mailto:robbin.ehn at oracle.com] > Sent: Freitag, 27. Oktober 2017 15:15 > To: Erik ?sterlund ; Andrew Haley ; Doerr, Martin ; Karen Kinnear ; Coleen Phillimore (coleen.phillimore at oracle.com) > Cc: hotspot-dev developers > Subject: Re: RFR(XL): 8185640: Thread-local handshakes > > Hi all, > > Poll in switches: > http://cr.openjdk.java.net/~rehn/8185640/v7/Interpreter-Poll-Switch-10/ > > Poll in return: > http://cr.openjdk.java.net/~rehn/8185640/v7/Interpreter-Poll-Ret-11/ > > Please take an extra look at poll in return. > > Sanity tested, big test run still running (99% complete - OK). > > Performance regression for the added polls increased to total of -0.68% vs > global poll. (was -0.44%) > > We are discussing the opt-out option, the newest suggestion is to make it > diagnostic. Opinions? > > For anyone applying these patches, the number 9 patch changes the option from > product. I have not sent that out. > > Thanks, Robbin > > > From artem.smotrakov at oracle.com Mon Oct 30 07:39:49 2017 From: artem.smotrakov at oracle.com (Artem Smotrakov) Date: Mon, 30 Oct 2017 10:39:49 +0300 Subject: RFR [10] 8189800: Add support for AddressSanitizer In-Reply-To: <51eabbae-5435-59be-f443-a6b214a17513@oracle.com> References: <51eabbae-5435-59be-f443-a6b214a17513@oracle.com> Message-ID: cc'ing hotspot-dev at openjdk.java.net as David suggested. Artem On 10/27/2017 11:02 PM, Artem Smotrakov wrote: > Hello, > > Please review the following patch which adds support for > AddressSanitizer. > > AddressSanitizer is a runtime memory error detector which looks for > various memory corruption issues and leaks. > > Please refer to [1] for details. AddressSanitizer is available in gcc > 4.8+ and clang 3.1+ > > The patch below introduces --enable-asan parameter for the configure > script which enables AddressSanitizer. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8189800 > Webrev: http://cr.openjdk.java.net/~asmotrak/8189800/webrev.00/ > > [1] https://github.com/google/sanitizers/wiki/AddressSanitizer > > Artem From artem.smotrakov at oracle.com Mon Oct 30 09:31:40 2017 From: artem.smotrakov at oracle.com (Artem Smotrakov) Date: Mon, 30 Oct 2017 12:31:40 +0300 Subject: RFR [10] 8189800: Add support for AddressSanitizer In-Reply-To: <55e0e055-2e65-5c83-3f8e-36895f71860e@oracle.com> References: <51eabbae-5435-59be-f443-a6b214a17513@oracle.com> <55e0e055-2e65-5c83-3f8e-36895f71860e@oracle.com> Message-ID: <3b4c5abb-762f-a66c-02d5-93909dc656d4@oracle.com> Hi Magnus, The current approach uses AddressSanitizer as a shared library (libasan.so) which is part of GCC/Clang toolkit. 
In case you use system toolkit, then libasan.so is available for linker and at runtime. But if you set a custom toolkit by --with-devkit option, then libasan.so form this toolkit may not be available for linker and at runtime by default. As a result, you can get errors while linking and running. To fix that, you normally need to make it available using ldconfig, or update LD_LIBRARY_PATH. That's why it updates LD_LIBRARY_PATH with DEVKIT_LIB_DIR if a custom toolkit was used. That may be helpful when you build JDK in environment like jib/jprt. I tried to remove exporting ASAN_ENABLED and DEVKIT_LIB_DIR, and as a result, ASAN_OPTIONS and DEVKIT_LIB_DIR didn't go to jtreg command which caused tests to fail when you run "make test". If we don't export ASAN_OPTIONS and DEVKIT_LIB_DIR, then the updates in TestCommon.gmk don't make much sense to me because those variables have to be explicitly set for "make" anyway. I can remove exporting those variables and revert TestCommon.gmk. Although, it looks nicer to me if we can run the tests just with "make test" without specifying ASAN_OPTIONS and DEVKIT_LIB_DIR explicitly. What do you think? Artem On 10/30/2017 10:50 AM, Magnus Ihse Bursie wrote: > On 2017-10-30 08:39, Artem Smotrakov wrote: >> cc'ing hotspot-dev at openjdk.java.net as David suggested. >> >> Artem >> >> >> On 10/27/2017 11:02 PM, Artem Smotrakov wrote: >>> Hello, >>> >>> Please review the following patch which adds support for >>> AddressSanitizer. >>> >>> AddressSanitizer is a runtime memory error detector which looks for >>> various memory corruption issues and leaks. >>> >>> Please refer to [1] for details. AddressSanitizer is available in >>> gcc 4.8+ and clang 3.1+ >>> >>> The patch below introduces --enable-asan parameter for the configure >>> script which enables AddressSanitizer. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8189800 >>> Webrev: http://cr.openjdk.java.net/~asmotrak/8189800/webrev.00/ > spec.gmk.in should only have export for variables that needs to be > exported in the environment for executing binaries, that is > ASAN_OPTIONS and LD_LIBRARY_PATH, not ASAN_ENABLED or DEVKIT_LIB_DIR. > > I'm also a bit curious about the addition of of DEVKIT_LIB_DIR. Would > you care to elaborate your thinking? > > Otherwise it looks good. > > /Magnus > >>> >>> [1] https://github.com/google/sanitizers/wiki/AddressSanitizer >>> >>> Artem >> > From coleen.phillimore at oracle.com Mon Oct 30 14:48:32 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Oct 2017 10:48:32 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <440f79ba-2da3-b627-53bc-e1842e3cf73c@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> <22afedef-59cc-ecde-48fc-0afb7b4bbb47@oracle.com> <815ac734-ea8b-ea2d-ecec-85cb547ba2f4@oracle.com> <440f79ba-2da3-b627-53bc-e1842e3cf73c@oracle.com> Message-ID: http://cr.openjdk.java.net/~coleenp/8189610.incr.02/webrev/index.html Changed JDK file to use PATH_MAX.? Retested jdk tier1 tests. 
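For context, the shape of the change being described, as a standalone sketch (hypothetical helper names; this is not the actual UnixFileSystem_md.c or canonicalize_md.c code):

  #include <limits.h>   // PATH_MAX (POSIX; assumed available on the platforms discussed)
  #include <stdio.h>

  // Hypothetical stand-in for canonicalize(): like the real check quoted below,
  // it rejects an output buffer smaller than PATH_MAX.
  static int canonicalize_sketch(const char* original, char* resolved, int len) {
    if (len < PATH_MAX) return -1;
    return snprintf(resolved, (size_t)len, "%s", original) < 0 ? -1 : 0;
  }

  int main() {
    char canonicalPath[PATH_MAX];   // sized with the platform limit rather than JVM_MAXPATHLEN
    return canonicalize_sketch("/tmp/./x", canonicalPath, (int)sizeof canonicalPath);
  }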
thanks, Coleen On 10/30/17 8:38 AM, coleen.phillimore at oracle.com wrote: > > > On 10/30/17 8:17 AM, David Holmes wrote: >> On 30/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >>> On 10/28/17 3:50 AM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> I've commented on the file location in response to Mandy's email. >>>> >>>> The only issue I'm still concerned about is the JVM_MAXPATHLEN >>>> issue. I think it is a bug to define a JVM_MAXPATHLEN that is >>>> bigger than the platform MAXPATHLEN. I also would not want to see >>>> any change in behaviour because of this - so AIX and Solaris should >>>> not get a different JVM_MAXPATHLEN due to this refactoring change. >>>> So yes I think this needs to be ifdef'd for Linux and reluctantly >>>> (because it was a copy error) for OSX/BSD as well. >>> >>> #if defined(AIX) || defined(SOLARIS) >>> #define JVM_MAXPATHLEN MAXPATHLEN >>> #else >>> // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. This may >>> //?????? cause problems if JVM and the rest of JDK are built on >>> different >>> //?????? Linux releases. Here we define JVM_MAXPATHLEN to be >>> MAXPATHLEN + 1, >>> //?????? so buffers declared in VM are always >= 4096. >>> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>> #endif >>> >>> Is this ok? >> >> Yes - thanks. It preserves existing behaviour on the VM side at >> least. Time will tell if it messes anything up on the JDK side for >> Linux/OSX. > > I don't want to wait for time so I'm investigating. > > It's one use is: > > Java_java_io_UnixFileSystem_canonicalize0(JNIEnv *env, jobject this, > ... > ??????? char canonicalPath[JVM_MAXPATHLEN]; > ??????? if (canonicalize((char *)path, > ???????????????????????? canonicalPath, JVM_MAXPATHLEN) < 0) { > ??????????? JNU_ThrowIOExceptionWithLastError(env, "Bad pathname"); > > Which goes to: > > canonicalize_md.c > > canonicalize(char *original, char *resolved, int len) > ??? if (len < PATH_MAX) { > ??????? errno = EINVAL; > ??????? return -1; > ??? } > > > So this should fail every time. > > sys/param.h:# define MAXPATHLEN??? PATH_MAX > > I haven't found any tests for it. > > I don't know why Java_java_io_UnixFileSystem uses JVM_MAXPATHLEN since > it's not calling the JVM interface as far as I can tell. I think it > should be changed to PATH_MAX. > > ? > Coleen >> >> David >> >>> thanks, >>> Coleen >>>> >>>> Thanks, >>>> David >>>> >>>> On 28/10/2017 12:08 AM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> On 10/27/17 9:37 AM, David Holmes wrote: >>>>>> On 27/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> >>>>>>> On 10/27/17 3:23 AM, David Holmes wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> Thanks for tackling this. >>>>>>>> >>>>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>>> >>>>>>>> Can you update the bug synopsis to show it covers both sets of >>>>>>>> files please. >>>>>>>> >>>>>>>> I hate to start with this (and it took me quite a while to >>>>>>>> realize it) but as Mandy pointed out jvm.h is not an exported >>>>>>>> interface from the JDK to the outside world (so not subject to >>>>>>>> CSR review), but is a private interface between the JVM and the >>>>>>>> JDK libraries. So I think really jvm.h belongs in the hotspot >>>>>>>> sources where it was, while jni.h belongs in the exported JDK >>>>>>>> sources. In which case the bulk of your changes to the hotspot >>>>>>>> files would not be needed - sorry. >>>>>>> >>>>>>> Maybe someone can make that decision and change at a later date. 
>>>>>>> The point of this change is that there is now only one of these >>>>>>> files that is shared.? I don't think jvm.h and the jvm_md.h >>>>>>> belong on the hotspot sources for the jdk to find them in some >>>>>>> random prims and os dependent directories. >>>>>> >>>>>> The one file that is needed is a hotspot file - jvm.h defines the >>>>>> interface that hotspot exports via jvm.cpp. >>>>>> >>>>>> If you leave jvm.h in hotspot/prims then a very large chunk of >>>>>> your boilerplate changes are not needed. The JDK code doesn't >>>>>> care what the name of the directory is - whatever it is just gets >>>>>> added as a -I directive (the JDK code will include "jvm.h" not >>>>>> "prims/jvm.h" the way hotspot sources do. >>>>>> >>>>>> This isn't something we want to change back or move again later. >>>>>> Whatever we do now we live with. >>>>> >>>>> I think it belongs with jni.h and I think the core libraries group >>>>> would agree.?? It seems more natural there than buried in the >>>>> hotspot prims directory.? I guess this is on hold while we have >>>>> this debate.?? Sigh. >>>>> >>>>> Actually with -I directives, changing to jvm.h from prims/jvm.h >>>>> would still work.?? Maybe we should change the name to jvm.hpp >>>>> since it's jvm.cpp though??? Or maybe just have two divergent >>>>> copies and close this as WNF. >>>>> >>>>>> >>>>>>> I'm happy to withdraw the CSR. We generally use the CSR process >>>>>>> to add and remove JVM_ interfaces even though they're a private >>>>>>> interface in case some other JVM/JDK combination relies on them. >>>>>>> The changes to these files are very minor though and not likely >>>>>>> to cause any even theoretical incompatibility, so I'll withdraw it. >>>>>>>> >>>>>>>> Moving on ... >>>>>>>> >>>>>>>> First to address the initial comments/query you had: >>>>>>>> >>>>>>>>> The JDK windows jni_md.h file defined jint as long and the >>>>>>>>> hotspot >>>>>>>>> windows jni_x86.h as int. I had to choose the jdk version >>>>>>>>> since it's the >>>>>>>>> public version, so there are changes to the hotspot files for >>>>>>>>> this. >>>>>>>> >>>>>>>> On Windows int and long are always the same as it uses ILP32 or >>>>>>>> LLP64 (not LP64 like *nix platforms). So either choice should >>>>>>>> be fine. That said there are some odd casting issues I comment >>>>>>>> on below. Does the VS compiler complain about mixing int and >>>>>>>> long in expressions? >>>>>>> >>>>>>> Yes, it does even though int and long are the same representation. >>>>>> >>>>>> And what an absolute mess that makes. :( >>>>>> >>>>>>>> >>>>>>>>> Generally I changed the code to use 'int' rather than 'jint' >>>>>>>>> where the >>>>>>>>> surrounding API didn't insist on consistently using java >>>>>>>>> types. We >>>>>>>>> should mostly be using C++ types within hotspot except in >>>>>>>>> interfaces to >>>>>>>>> native/JNI code. >>>>>>>> >>>>>>>> I think you pulled too hard on a few threads here and things >>>>>>>> are starting to unravel. There are numerous cases I refer to >>>>>>>> below where either the cast seems unnecessary/inappropriate or >>>>>>>> else highlights a bunch of additional changes that also need to >>>>>>>> be made. The fan out from this could be horrendous. Unless you >>>>>>>> actually get some kind of error - and I'd like to understand >>>>>>>> the details of those - I would not suggest making these changes >>>>>>>> as part of this work. >>>>>>> >>>>>>> I didn't make any change unless there was was an error. I have >>>>>>> 100 failed JPRT jobs to confirm!? 
I eventually got a Windows >>>>>>> system to compile and test this on. Actually some of the changes >>>>>>> came out better.? Cases where we use jint as a bool simply >>>>>>> turned to int. We do not have an overload for bool for cmpxchg. >>>>>> >>>>>> That's unfortunate - ditto for OrderAccess. >>>>>> >>>>>>>> >>>>>>>> Looking through I have a quite a few queries/comments - >>>>>>>> apologies in advance as I know how tedious this is: >>>>>>>> >>>>>>>> make/hotspot/lib/CompileLibjsig.gmk >>>>>>>> src/java.base/solaris/native/libjsig/jsig.c >>>>>>>> >>>>>>>> Took a while to figure out why the include was needed. :) As a >>>>>>>> follow up I suggest just deleting the -I include directive, >>>>>>>> delete the Solaris-only definition of JSIG_VERSION_1_4_1, and >>>>>>>> delete everything to do with JVM_get_libjsig_version. It is all >>>>>>>> obsolete. >>>>>>> >>>>>>> Can I patch up jsig in a separate RFE?? I don't remember why >>>>>>> this broke so I simply moved JSIG #define.? Is jsig obsolete? >>>>>>> Removing JVM_* definitions generally requires a CSR. >>>>>> >>>>>> I did say "As a follow up". jsig is not obsolete but the jsig >>>>>> versioning code, only used by Solaris, is. >>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/cpu/arm/interp_masm_arm.cpp >>>>>>>> >>>>>>>> Why did you need to add the jvm.h include? >>>>>>>> >>>>>>> >>>>>>> ?? tbz(Raccess_flags, JVM_ACC_SYNCHRONIZED_BIT, unlocked); >>>>>> >>>>>> Okay. I'm not going to try and figure out how this code found >>>>>> this before. >>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/os/windows/os_windows.cpp. >>>>>>>> >>>>>>>> The type of process_exiting should be uint to match the DWORD >>>>>>>> of GetCurrentThreadID. Then you should need any casts. Also you >>>>>>>> missed this jint cast: >>>>>>>> >>>>>>>> 3796???????? process_exiting != (jint)GetCurrentThreadId()) { >>>>>>> >>>>>>> Yes, that's better to change process_exiting to a DWORD.? It >>>>>>> needs a DWORD cast to 0 in the cmpxchg. >>>>>>> >>>>>>> ???????? Atomic::cmpxchg(GetCurrentThreadId(), &process_exiting, >>>>>>> (DWORD)0); >>>>>>> >>>>>>> These templates are picky. >>>>>> >>>>>> Yes - their inability to deal with literals is extremely >>>>>> frustrating. >>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/c1/c1_Canonicalizer.hpp >>>>>>>> >>>>>>>> ? 43 #ifdef _WINDOWS >>>>>>>> ? 44?? // jint is defined as long in jni_md.h, so convert from >>>>>>>> int to jint >>>>>>>> ? 45?? void set_constant(int x) { set_constant((jint)x); } >>>>>>>> ? 46 #endif >>>>>>>> >>>>>>>> Why is this necessary? int and long are the same on Windows. >>>>>>>> The whole point is that jint hides the underlying type, so >>>>>>>> where does this go wrong? >>>>>>> >>>>>>> No, they are not the same types even though they have the same >>>>>>> representation! >>>>>> >>>>>> This is truly unfortunate. >>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>>>>>> >>>>>>>> ?ConstantIntValue((jint)0); >>>>>>>> >>>>>>>> why is this cast needed? what causes the ambiguity? (If this >>>>>>>> was a template I'd understand ;-) ). Also didn't you change >>>>>>>> that constructor to take an int anyway - not that I think it >>>>>>>> should - see below. >>>>>>> >>>>>>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't >>>>>>> match 'long' better than any pointer type.? So this cast is needed. >>>>>> >>>>>> But you changed the constructor to take an int! >>>>>> >>>>>> ?class ConstantIntValue: public ScopeValue { >>>>>> ? private: >>>>>> -? 
jint _value; >>>>>> +? int _value; >>>>>> ? public: >>>>>> -? ConstantIntValue(jint value)???????? { _value = value; } >>>>>> +? ConstantIntValue(int value)????????? { _value = value; } >>>>>> >>>>>> >>>>> >>>>> Okay I removed this cast. >>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/ci/ciReplay.cpp >>>>>>>> >>>>>>>> 793???????? jint* dims = NEW_RESOURCE_ARRAY(jint, rank); >>>>>>>> >>>>>>>> why should this be jint? >>>>>>> >>>>>>> To avoid a cast from int* to jint* in the line below: >>>>>>> >>>>>>> ????????? value = kelem->multi_allocate(rank, dims, CHECK); >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/classfile/altHashing.cpp >>>>>>>> >>>>>>>> Okay this looks more consistent with jint. >>>>>>> >>>>>>> Yes.? I translated this from some native code iirc. >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/code/debugInfo.hpp >>>>>>>> >>>>>>>> These changes seem wrong. We have: >>>>>>>> >>>>>>>> ConstantLongValue(jlong value) >>>>>>>> ConstantDoubleValue(jdouble value) >>>>>>>> >>>>>>>> so we should have: >>>>>>>> >>>>>>>> ConstantIntValue(jint value) >>>>>>> >>>>>>> Again, there are multiple call sites with '0', which match int >>>>>>> trivially but are confused with long.? It's less consistent I >>>>>>> agree but better to not cast all the call sites. >>>>>> >>>>>> This is really making a mess of the APIs - they should be a jint >>>>>> but we declare them int because of a 0 casting problem. Can't we >>>>>> just use 0L? >>>>> >>>>> There aren't that many casts.? You're right, that would have been >>>>> better in some places. >>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/code/relocInfo.cpp >>>>>>>> >>>>>>>> Change seems unnecessary - int32_t is fine >>>>>>>> >>>>>>> >>>>>>> No, int32_t doesn't match the calls below it.? They all assume >>>>>>> _lo and _hi are jint. >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/compiler/compileBroker.cpp >>>>>>>> src/hotspot/share/compiler/compileBroker.hpp >>>>>>>> >>>>>>>> I see a complete mix of int and jint in this class, so why make >>>>>>>> the one change you did ?? >>>>>>> >>>>>>> This is another case of using jint as a flag with cmpxchg. The >>>>>>> templates for cmpxchg want the types to match and 0 and 1 are >>>>>>> essentially 'int'.? This is a lot cleaner this way. >>>>>> >>>>>> >>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >>>>>>>> >>>>>>>> 1700???? tty->write((char*) start, MIN2(length, (jint)O_BUFLEN)); >>>>>>>> >>>>>>>> why did you need to add the jint cast? It's used without any >>>>>>>> cast on the next two lines: >>>>>>>> >>>>>>>> 1701???? length -= O_BUFLEN; >>>>>>>> 1702???? offset += O_BUFLEN; >>>>>>>> >>>>>>> >>>>>>> There's a conversion from O_BUFLEN from int to long in 1701 and >>>>>>> 1702.?? MIN2 is a template that wants the types to match exactly. >>>>>> >>>>>> $%^%$! templates! >>>>>> >>>>>>>> ?? >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>>>>>>> >>>>>>>> Looking around this code it seems very confused about types - >>>>>>>> eg the previous function is declared jboolean yet returns a >>>>>>>> jint on one path! It isn't clear to me if the return type is >>>>>>>> what should be changed or the parameter type? I would just >>>>>>>> leave this alone. >>>>>>> >>>>>>> I can't leave it alone because it doesn't compile that way. This >>>>>>> was the minimal change and yea, does look a bit inconsistent. 
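To make the ConstantIntValue((jint)0) ambiguity discussed above concrete, a self-contained illustration (hypothetical names; the real overload set is different):

  typedef long my_jint;          // jint is 'long' in the 32-bit Windows jni_md.h;
                                 // spelled my_jint here to avoid implying the real typedef
  struct ScopeValueLike {};      // hypothetical

  void record(my_jint)         {}   // stands in for the jint/ConstantIntValue path
  void record(ScopeValueLike*) {}   // a competing pointer overload

  void demo() {
    // record(0);           // ambiguous: the literal 0 converts to 'long' and to a
    //                      // null pointer equally well, so neither overload wins.
    record((my_jint)0);     // the explicit cast resolves it
    record(5);              // a non-zero int literal is not a null pointer constant,
                            // so only the integral overload is viable here
  }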
>>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/opto/mulnode.cpp >>>>>>>> >>>>>>>> Okay TypeInt has jint parts, so the remaining int32_t >>>>>>>> declarations (A, B, C, D) should also be jint. >>>>>>> >>>>>>> Yes.? c2 uses jint types. >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/opto/parse3.cpp >>>>>>>> >>>>>>>> I agree with the changes you made, but then: >>>>>>>> >>>>>>>> ?419???? jint dim_con = find_int_con(length[j], -1); >>>>>>>> >>>>>>>> should also be changed. >>>>>>>> >>>>>>>> And obviously MultiArrayExpandLimit should be defined as int >>>>>>>> not intx! >>>>>>> >>>>>>> Everything in globals.hpp is intx.? That's a thread that I don't >>>>>>> want to pull on! >>>>>> >>>>>> We still have that limitation? >>>>>>> >>>>>>> Changed dim_con to int. >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/opto/phaseX.cpp >>>>>>>> >>>>>>>> I can see that intcon(jint i) is consistent with longcon(jlong >>>>>>>> l), but the use of "i" in the code is more consistent with int >>>>>>>> than jint. >>>>>>> >>>>>>> huh?? really? >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/opto/type.cpp >>>>>>>> >>>>>>>> 1505 int TypeInt::hash(void) const { >>>>>>>> 1506?? return java_add(java_add(_lo, _hi), >>>>>>>> java_add((jint)_widen, (jint)Type::Int)); >>>>>>>> 1507 } >>>>>>>> >>>>>>>> I can see that the (jint) casts you added make sense, but then >>>>>>>> the whole function should be returning jint not int. Ditto the >>>>>>>> other hash functions. >>>>>>> >>>>>>> I'm not messing with this, this is the minimal in type fixing >>>>>>> that I'm going to do here. >>>>>> >>>>>> >>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/prims/jni.cpp >>>>>>>> >>>>>>>> I think vm_created should be a bool. In fact all the fields you >>>>>>>> changed are logically bools - do Atomics work for bool now? >>>>>>> >>>>>>> No, they do not.?? I had thought bool would be better originally >>>>>>> too. >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/prims/jvm.cpp >>>>>>>> >>>>>>>> is_attachable is the terminology used in the JDK code. >>>>>>> >>>>>>> Well the JDK version had is_attach_supported() as the flag name >>>>>>> so I used that in this one place. >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/prims/jvmtiEnvBase.cpp >>>>>>>> src/hotspot/share/prims/jvmtiImpl.cpp >>>>>>>> >>>>>>>> Are you making parameters consistent with the fields they >>>>>>>> initialize? >>>>>>> >>>>>>> They're consistent with the declarations now. >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/prims/jvmtiTagMap.cpp >>>>>>>> >>>>>>>> There is a mix of int and jint for slot in this code. You fixed >>>>>>>> some, but this remains: >>>>>>>> >>>>>>>> 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong >>>>>>>> thread_tag, >>>>>>>> 2441 jlong tid, >>>>>>>> 2442 jint depth, >>>>>>>> 2443 jmethodID method, >>>>>>>> 2444 jlocation bci, >>>>>>>> 2445 jint slot, >>>>>>> >>>>>>> Right for consistency with the declarations. >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/runtime/perfData.cpp >>>>>>>> >>>>>>>> Callers pass both jint and int, so param type seems arbitrary. >>>>>>> >>>>>>> They are, but importantly they match the declarations. >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/hotspot/share/runtime/perfMemory.cpp >>>>>>>> src/hotspot/share/runtime/perfMemory.hpp >>>>>>>> >>>>>>>> PerfMemory::_initialized should ideally be a bool - can >>>>>>>> OrderAccess handle that now? >>>>>>> >>>>>>> Nope. 
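Helpers like the java_add used in the TypeInt::hash snippet above avoid relying on signed int overflow, which is undefined behaviour in C++; a sketch of the usual wrap-around-safe idiom (not the actual HotSpot definition of java_add):

  #include <stdint.h>

  typedef int32_t jint_t;   // local stand-in for jint

  // Do the addition in unsigned arithmetic, where wrap-around is well defined,
  // then convert back; on two's-complement targets this matches Java's int '+'.
  inline jint_t java_add_sketch(jint_t a, jint_t b) {
    return (jint_t)((uint32_t)a + (uint32_t)b);
  }

  // Combining fields the way the quoted hash function does.
  inline int hash_sketch(jint_t lo, jint_t hi, jint_t widen, jint_t kind) {
    return java_add_sketch(java_add_sketch(lo, hi), java_add_sketch(widen, kind));
  }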
>>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/java.base/share/native/include/jvm.h >>>>>>>> >>>>>>>> Not clear why the jio functions are not also JNICALL ? >>>>>>> >>>>>>> They are now.? The JDK version didn't have JNICALL. JVM needs >>>>>>> JNICALL.? I can't tell you why JDK didn't need JNICALL linkage. >>>>>> >>>>>> ?? JVM currently does not have JNICALL. But they are declared as >>>>>> "extern C". >>>>> >>>>> This was a compilation error on Windows with JDK.?? Maybe the C >>>>> code in the JDK doesn't complain about linkage differences. I'll >>>>> have to go back and figure this out then. >>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/java.base/unix/native/include/jni_md.h >>>>>>>> >>>>>>>> There is no need to special case ARM. The differences in the >>>>>>>> existing code were for LTO support and that is now irrelevant. >>>>>>> >>>>>>> See discussion with Magnus.?? We still build ARM for jdk10/hs so >>>>>>> I needed this conditional or of course I wouldn't have added >>>>>>> it.? We can remove it with LTO support. >>>>>> >>>>>> Those builds are gone - this is obsolete. But yes all LTO can be >>>>>> removed later if you wish. Just trying to simplify things now. >>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/java.base/unix/native/include/jvm_md.h >>>>>>>> >>>>>>>> I know you've just copied this across, but it seems wrong to me: >>>>>>>> >>>>>>>> ?57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on >>>>>>>> others. This may >>>>>>>> ? 58 //?????? cause problems if JVM and the rest of JDK are >>>>>>>> built on different >>>>>>>> ? 59 //?????? Linux releases. Here we define JVM_MAXPATHLEN to >>>>>>>> be MAXPATHLEN + 1, >>>>>>>> ? 60 //?????? so buffers declared in VM are always >= 4096. >>>>>>>> ? 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>>>>> >>>>>>>> It doesn't make sense to me to define an internal "max path >>>>>>>> length" that can _exceed_ the platform max! >>>>>>>> >>>>>>>> That aside there's no support for building different parts of >>>>>>>> the JDK on different platforms and then bringing them together. >>>>>>>> And in any case I would think the real problem would be >>>>>>>> building on a platform that uses 4096 and running on one that >>>>>>>> uses 4095! >>>>>>>> >>>>>>>> But that aside this is a Linux hack and should be guarded by >>>>>>>> ifdef LINUX. (I doubt BSD needs it, the bsd file is just a copy >>>>>>>> of the linux one - the JDK macosx version does the right >>>>>>>> thing). Solaris and AIX should stay as-is at MAXPATHLEN. >>>>>>> >>>>>>> All of the unix platforms had MAXPATHLEN+1.? I'll leave it for >>>>>>> now and we can investigate that further. >>>>>> >>>>>> I see the following existing code: >>>>>> >>>>>> src/java.base/unix/native/include/jvm_md.h: >>>>>> >>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>> >>>>>> src/java.base/macosx/native/include/jvm_md.h >>>>>> >>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>> >>>>>> src/hotspot/os/aix/jvm_aix.h >>>>>> >>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>> >>>>>> src/hotspot/os/bsd/jvm_bsd.h >>>>>> >>>>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1? // blindly copied from >>>>>> Linux version >>>>>> >>>>>> src/hotspot/os/linux/jvm_linux.h >>>>>> >>>>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>>> >>>>>> src/hotspot/os/solaris/jvm_solaris.h >>>>>> >>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>> >>>>>> This is a linux only hack (if you ignore the blind copy from >>>>>> linux into the BSD code in the VM). >>>>> >>>>> Oh, thanks, so should I add a bunch of ifdefs then?? 
Or do you >>>>> think having MAXPATHLEN + 1 will really break the other >>>>> platforms?? Do you really see this as a problem or are you just >>>>> pointing out inconsistency? >>>>>> >>>>>>>> >>>>>>>> ?86 #define ASYNC_SIGNAL???? SIGJVM2 >>>>>>>> >>>>>>>> This only exists on Solaris so I think should be in #ifdef >>>>>>>> SOLARIS, to make that clear. >>>>>>> >>>>>>> Ok.? I'll add this. >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> src/java.base/windows/native/include/jvm_md.h >>>>>>>> >>>>>>>> Given the differences between the two versions either something >>>>>>>> has been broken or "extern C" declarations are not needed :) >>>>>>> >>>>>>> Well, they are needed for Hotspot to build and do not prevent >>>>>>> jdk from building.? I don't know what was broken. >>>>>> >>>>>> We really need to understand this better. Maybe related to the >>>>>> map files that expose the symbols. ?? >>>>> >>>>> They're needed because the JDK files are written mostly in C and >>>>> that doesn't complain about the linkage difference. Hotspot files >>>>> are in C++ which does complain. >>>>> >>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> That was a really painful way to spend most of my Friday. TGIF! :) >>>>>>> >>>>>>> Thanks for going through it.? See comments inline for changes. >>>>>>> Generating a webrev takes hours so I'm not going to do that >>>>>>> unless you insist. >>>>>> >>>>>> An incremental webrev shouldn't take long - right? You're a mq >>>>>> maestro now. :) >>>>> >>>>> Well I generally trash a repository whenever I use mq but sure. >>>>>> >>>>>> If you can reasonably produce an incremental webrev once you've >>>>>> settled on all the comments/issues that would be good. >>>>> >>>>> Ok, sure. >>>>> >>>>> Coleen >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>> >>>>>>>> On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>> ??Hi Magnus, >>>>>>>>> >>>>>>>>> Thank you for reviewing this.?? I have a new version that >>>>>>>>> takes out the hack in globalDefinitions.hpp and adds casts to >>>>>>>>> src/hotspot/share/opto/type.cpp instead. >>>>>>>>> >>>>>>>>> Also some fixes from Martin at SAP. >>>>>>>>> >>>>>>>>> open webrev at >>>>>>>>> http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >>>>>>>>> >>>>>>>>> see below. >>>>>>>>> >>>>>>>>> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>>>>>>>>> Coleen, >>>>>>>>>> >>>>>>>>>> Thank you for addressing this! >>>>>>>>>> >>>>>>>>>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>>>>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>>>>>> >>>>>>>>>>> Mostly used sed to remove prims/jvm.h and move #include >>>>>>>>>>> "jvm.h" after precompiled.h, so if you have repetitive >>>>>>>>>>> stress wrist issues don't click on most of these files. >>>>>>>>>>> >>>>>>>>>>> There were more issues to resolve, however. The JDK windows >>>>>>>>>>> jni_md.h file defined jint as long and the hotspot windows >>>>>>>>>>> jni_x86.h as int. I had to choose the jdk version since it's >>>>>>>>>>> the public version, so there are changes to the hotspot >>>>>>>>>>> files for this. Generally I changed the code to use 'int' >>>>>>>>>>> rather than 'jint' where the surrounding API didn't insist >>>>>>>>>>> on consistently using java types. We should mostly be using >>>>>>>>>>> C++ types within hotspot except in interfaces to native/JNI >>>>>>>>>>> code. 
There are a couple of hacks in places where adding >>>>>>>>>>> multiple jint casts was too painful. >>>>>>>>>>> >>>>>>>>>>> Tested with JPRT and tier2-4 (in progress). >>>>>>>>>>> >>>>>>>>>>> open webrev at >>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>>>>>>>>> >>>>>>>>>> Looks great! >>>>>>>>>> >>>>>>>>>> Just a few comments: >>>>>>>>>> >>>>>>>>>> * src/java.base/unix/native/include/jni_md.h: >>>>>>>>>> >>>>>>>>>> I don't think the externally_visible attribute should be >>>>>>>>>> there for arm. I know this was the case for the corresponding >>>>>>>>>> hotspot file for arm, but that was techically incorrect. The >>>>>>>>>> proper dependency here is that externally_visible should be >>>>>>>>>> in all JNIEXPORT if and only if we're building with JVM >>>>>>>>>> feature "link-time-opt". Traditionally, that feature been >>>>>>>>>> enabled when building arm32 builds, and only then, so there's >>>>>>>>>> been a (coincidentally) connection here. Nowadays, Oracle >>>>>>>>>> does not care about the arm32 builds, and I'm not sure if >>>>>>>>>> anyone else is building them with link-time-opt enabled. >>>>>>>>>> >>>>>>>>>> It does seem wrong to me to export this behavior in the >>>>>>>>>> public jni_md.h file, though. I think the correct way to >>>>>>>>>> solve this, if we should continue supporting link-time-opt is >>>>>>>>>> to make sure this attribute is set for exported hotspot >>>>>>>>>> functions. If it's still needed, that is. A quick googling >>>>>>>>>> seems to indicate that visibility("default") might be enough >>>>>>>>>> in modern gcc's. >>>>>>>>>> >>>>>>>>>> A third option is to remove the support for link-time-opt >>>>>>>>>> entirely, if it's not really used. >>>>>>>>> >>>>>>>>> I didn't know how to change this since we are still building >>>>>>>>> ARM with the jdk10/hs repository, and ARM needed this change.? >>>>>>>>> I could wait until we bring down the jdk10/master changes that >>>>>>>>> remove the ARM build and remove this conditional before I >>>>>>>>> push. Or we could file an RFE to remove link-time-opt (?) and >>>>>>>>> remove it then? >>>>>>>>> >>>>>>>>>> >>>>>>>>>> * src/java.base/unix/native/include/jvm_md.h and >>>>>>>>>> src/java.base/windows/native/include/jvm_md.h: >>>>>>>>>> >>>>>>>>>> These files define a public API, and contain non-trivial >>>>>>>>>> changes. I suspect you should file a CSR request. (Even >>>>>>>>>> though I realize you're only matching the header file with >>>>>>>>>> the reality.) >>>>>>>>>> >>>>>>>>> >>>>>>>>> I filed the CSR.?? Waiting for the next steps. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>>> /Magnus >>>>>>>>>> >>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>>>>>>>>> >>>>>>>>>>> I have a script to update copyright files on commit. >>>>>>>>>>> >>>>>>>>>>> Thanks to Magnus and ErikJ for the makefile changes. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>> > From dmitry.samersoff at bell-sw.com Mon Oct 30 18:05:41 2017 From: dmitry.samersoff at bell-sw.com (Dmitry Samersoff) Date: Mon, 30 Oct 2017 21:05:41 +0300 Subject: [10] RFR 8186046 Minimal ConstantDynamic support In-Reply-To: References: Message-ID: Paul, templateTable_x86.cpp: 564 const Register flags = rcx; 565 const Register rarg = NOT_LP64(rcx) LP64_ONLY(c_rarg1); Should we use another register for rarg under NOT_LP64 ? 
-Dmitry On 10/26/2017 08:03 PM, Paul Sandoz wrote: > Hi, > > Please review the following patch for minimal dynamic constant support: > > http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8186046-minimal-condy-support-hs/webrev/ > > https://bugs.openjdk.java.net/browse/JDK-8186046 > https://bugs.openjdk.java.net/browse/JDK-8186209 > > This patch is based on the JDK 10 unified HotSpot repository. Testing so far looks good. > > By minimal i mean just the support in the runtime for a dynamic constant pool entry to be referenced by a LDC instruction or a bootstrap method argument. Much of the work leverages the foundations built by invoke dynamic but is arguably simpler since resolution is less complex. > > A small set of bootstrap methods will be proposed as a follow on issue for 10 (these are currently being refined in the amber repository). > > Bootstrap method invocation has not changed (and the rules are the same for dynamic constants and indy). It is planned to enhance this in a further major release to support lazy resolution of bootstrap method arguments. > > The CSR for the VM specification is here: > > https://bugs.openjdk.java.net/browse/JDK-8189199 > > the j.l.invoke package documentation was also updated but please consider the VM specification as the definitive "source of truth" (we may clean up this area further later on so it becomes more informative, and that may also apply to duplicative text on MethodHandles/VarHandles). > > Any AoT-related work will be deferred to a future release. > > ? > > This patch only supports x64 platforms. There is a small set of changes specific to x64 (specifically to support null and primitives constants, as prior to this patch null was used as a sentinel for resolution and certain primitives types would never have been encountered, such as say byte). > > We will need to follow up with the SPARC platform and it is hoped/anticipated that OpenJDK members responsible for other platforms (namely ARM and PPC) will separately provide patches. > > ? > > Many of tests rely on an experimental byte code API that supports the generation of byte code with dynamic constants. > > One test uses class file bytes produced from a modified version of asmtools. The modifications have now been pushed but a new version of asmtools need to be rolled into jtreg before the test can operate directly on asmtools information rather than embedding class file bytes directly in the test. > > ? > > Paul. > From volker.simonis at gmail.com Mon Oct 30 19:34:22 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 30 Oct 2017 20:34:22 +0100 Subject: RFR(S): 8187091: ReturnBlobToWrongHeapTest fails because of problems in CodeHeap::contains_blob() In-Reply-To: References: Message-ID: Hi Vladimir, this one is still pending (you only pushed "8166317: InterpreterCodeSize should be computed"). Could you please also sponsor this one? The latest version is here: http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v2/ Thank you and best regards, Volker On Tue, Sep 5, 2017 at 6:35 PM, Vladimir Kozlov wrote: > On 9/4/17 10:23 AM, Volker Simonis wrote: >> >> On Fri, Sep 1, 2017 at 6:00 PM, Vladimir Kozlov >> wrote: >>> >>> Checking type is emulation of virtual call ;-) >> >> >> I agree :) But it is only a bimorphic dispatch in this case which >> should be still faster than a normal virtual call. >> >>> But I agree that it is simplest solution - one line change (excluding >>> comment - comment is good BTW). >>> >> >> Thanks. 
>> >>> You can also add guard AOT_ONLY() around aot specific code: >>> >>> const void* start = AOT_ONLY( (code_blob_type() == CodeBlobType::AOT) >>> ? >>> blob->code_begin() : ) (void*)blob; >>> >>> because we do have builds without AOT. >>> >> >> Done. Please find the new webrev here: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091.v1/ > > > Looks good. Thank you for updated CodeBlob description comment. > >> >> Could you please sponsor the change once jdk10-hs opens again? > > > We have to wait when jdk10 "consolidation" is finished. It may take 2 weeks. > >> >> Thanks, >> Volker >> >> PS: one thing which is still unclear to me is why you haven't caught >> this issue before? Isn't >> test/compiler/codecache/stress/ReturnBlobToWrongHeapTest.java part of >> JPRT and/or your regular tests? > > > test/compiler/codecache/stress are excluded from JPRT runs: > > https://bugs.openjdk.java.net/browse/JDK-8069021 > > Also these tests are marked with @key stress. Originally it was only 2 tests > and ReturnBlobToWrongHeapTest.java was added later: > > https://bugs.openjdk.java.net/browse/JDK-8069021 > > I am trying to find which testing tier runs them. I will follow this. > > Thanks, > Vladimir > > >> >> >>> Thanks, >>> Vladimir >>> >>> >>> On 9/1/17 8:42 AM, Volker Simonis wrote: >>>> >>>> >>>> Hi, >>>> >>>> can I please have a review and sponsor for the following small fix: >>>> >>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8187091/ >>>> https://bugs.openjdk.java.net/browse/JDK-8187091 >>>> >>>> We see failures in >>>> test/compiler/codecache/stress/ReturnBlobToWrongHeapTest.java which >>>> are cause by problems in CodeHeap::contains_blob() for corner cases >>>> with CodeBlobs of zero size: >>>> >>>> # A fatal error has been detected by the Java Runtime Environment: >>>> # >>>> # Internal Error (heap.cpp:248), pid=27586, tid=27587 >>>> # guarantee((char*) b >= _memory.low_boundary() && (char*) b < >>>> _memory.high()) failed: The block to be deallocated 0x00007fffe6666f80 >>>> is not within the heap starting with 0x00007fffe6667000 and ending >>>> with 0x00007fffe6ba000 >>>> >>>> The problem is that JDK-8183573 replaced >>>> >>>> virtual bool contains_blob(const CodeBlob* blob) const { return >>>> low_boundary() <= (char*) blob && (char*) blob < high(); } >>>> >>>> by: >>>> >>>> bool contains_blob(const CodeBlob* blob) const { return >>>> contains(blob->code_begin()); } >>>> >>>> But that my be wrong in the corner case where the size of the >>>> CodeBlob's payload is zero (i.e. the CodeBlob consists only of the >>>> 'header' - i.e. the C++ object itself) because in that case >>>> CodeBlob::code_begin() points right behind the CodeBlob's header which >>>> is a memory location which doesn't belong to the CodeBlob anymore. >>>> >>>> This exact corner case is exercised by ReturnBlobToWrongHeapTest which >>>> allocates CodeBlobs of size zero (i.e. zero 'payload') with the help >>>> of sun.hotspot.WhiteBox.allocateCodeBlob() until the CodeCache fills >>>> up. The test first fills the 'non-profiled nmethods' CodeHeap. If the >>>> 'non-profiled nmethods' CodeHeap is full, the VM automatically tries >>>> to allocate from the 'profiled nmethods' CodeHeap until that fills up >>>> as well. But in the CodeCache the 'profiled nmethods' CodeHeap is >>>> located right before the non-profiled nmethods' CodeHeap. 
So if the >>>> last CodeBlob allocated from the 'profiled nmethods' CodeHeap has a >>>> payload size of zero and uses all the CodeHeaps remaining size, we >>>> will end up with a CodeBlob whose code_begin() address will point >>>> right behind the actual CodeHeap (i.e. it will point right at the >>>> beginning of the adjacent, 'non-profiled nmethods' CodeHeap). This >>>> will result in the above guarantee to fire, when we will try to free >>>> the last allocated CodeBlob (with >>>> sun.hotspot.WhiteBox.freeCodeBlob()). >>>> >>>> In a previous mail thread >>>> >>>> >>>> (http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-August/028175.html) >>>> Vladimir explained why JDK-8183573 was done: >>>> >>>>> About contains_blob(). The problem is that AOTCompiledMethod allocated >>>>> in >>>>> CHeap and not in aot code section (which is RO): >>>>> >>>>> >>>>> >>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/8acd232fb52a/src/share/vm/aot/aotCompiledMethod.hpp#l124 >>>>> >>>>> It is allocated in CHeap after AOT library is loaded. Its code_begin() >>>>> points to AOT code section but AOTCompiledMethod* >>>>> points outside it (to normal malloced space) so you can't use >>>>> (char*)blob >>>>> address. >>>> >>>> >>>> >>>> and proposed these two fixes: >>>> >>>>> There are 2 ways to fix it, I think. >>>>> One is to add new field to CodeBlobLayout and set it to blob* address >>>>> for >>>>> normal CodeCache blobs and to code_begin for >>>>> AOT code. >>>>> Second is to use contains(blob->code_end() - 1) assuming that AOT code >>>>> is >>>>> never zero. >>>> >>>> >>>> >>>> I came up with a slightly different solution - just use >>>> 'CodeHeap::code_blob_type()' whether to use 'blob->code_begin()' (for >>>> the AOT case) or '(void*)blob' (for all other blobs) as input for the >>>> call to 'CodeHeap::contain()'. It's simple and still much cheaper than >>>> a virtual call. What do you think? >>>> >>>> I've also updated the documentation of the CodeBlob class hierarchy in >>>> codeBlob.hpp. Please let me know if I've missed something. >>>> >>>> Thank you and best regards, >>>> Volker >>>> >>> > From paul.sandoz at oracle.com Mon Oct 30 19:44:54 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 30 Oct 2017 12:44:54 -0700 Subject: [10] RFR 8186046 Minimal ConstantDynamic support In-Reply-To: References: Message-ID: <93431280-9CBF-4722-961D-F2D2D0F83B4E@oracle.com> Hi, Thanks for reviewing. > On 30 Oct 2017, at 11:05, Dmitry Samersoff wrote: > > Paul, > > templateTable_x86.cpp: > > 564 const Register flags = rcx; > 565 const Register rarg = NOT_LP64(rcx) LP64_ONLY(c_rarg1); > > Should we use another register for rarg under NOT_LP64 ? > I think it should be ok, it i ain?t an expert here on the interpreter and the calling conventions, so please correct me. Some more context: + const Register flags = rcx; + const Register rarg = NOT_LP64(rcx) LP64_ONLY(c_rarg1); + __ movl(rarg, (int)bytecode()); The current bytecode code is loaded into ?rarg? + call_VM(obj, CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_ldc), rarg); Then ?rarg" is the argument to the call to InterpreterRuntime::resolve_ldc, after which it is no longer referred to. +#ifndef _LP64 + // borrow rdi from locals + __ get_thread(rdi); + __ get_vm_result_2(flags, rdi); + __ restore_locals(); +#else + __ get_vm_result_2(flags, r15_thread); +#endif The result from the call is then loaded into flags. So i don?t think it matters in this case if rcx is aliased. Paul. 
> -Dmitry > > > On 10/26/2017 08:03 PM, Paul Sandoz wrote: >> Hi, >> >> Please review the following patch for minimal dynamic constant support: >> >> http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8186046-minimal-condy-support-hs/webrev/ >> >> https://bugs.openjdk.java.net/browse/JDK-8186046 >> https://bugs.openjdk.java.net/browse/JDK-8186209 >> >> This patch is based on the JDK 10 unified HotSpot repository. Testing so far looks good. >> >> By minimal i mean just the support in the runtime for a dynamic constant pool entry to be referenced by a LDC instruction or a bootstrap method argument. Much of the work leverages the foundations built by invoke dynamic but is arguably simpler since resolution is less complex. >> >> A small set of bootstrap methods will be proposed as a follow on issue for 10 (these are currently being refined in the amber repository). >> >> Bootstrap method invocation has not changed (and the rules are the same for dynamic constants and indy). It is planned to enhance this in a further major release to support lazy resolution of bootstrap method arguments. >> >> The CSR for the VM specification is here: >> >> https://bugs.openjdk.java.net/browse/JDK-8189199 >> >> the j.l.invoke package documentation was also updated but please consider the VM specification as the definitive "source of truth" (we may clean up this area further later on so it becomes more informative, and that may also apply to duplicative text on MethodHandles/VarHandles). >> >> Any AoT-related work will be deferred to a future release. >> >> ? >> >> This patch only supports x64 platforms. There is a small set of changes specific to x64 (specifically to support null and primitives constants, as prior to this patch null was used as a sentinel for resolution and certain primitives types would never have been encountered, such as say byte). >> >> We will need to follow up with the SPARC platform and it is hoped/anticipated that OpenJDK members responsible for other platforms (namely ARM and PPC) will separately provide patches. >> >> ? >> >> Many of tests rely on an experimental byte code API that supports the generation of byte code with dynamic constants. >> >> One test uses class file bytes produced from a modified version of asmtools. The modifications have now been pushed but a new version of asmtools need to be rolled into jtreg before the test can operate directly on asmtools information rather than embedding class file bytes directly in the test. >> >> ? >> >> Paul. >> > From mandy.chung at oracle.com Mon Oct 30 21:08:53 2017 From: mandy.chung at oracle.com (mandy chung) Date: Mon, 30 Oct 2017 14:08:53 -0700 Subject: RFR: 8190287: Update JDK's internal ASM to ASMv6 In-Reply-To: <59F3690B.6070309@oracle.com> References: <59F3690B.6070309@oracle.com> Message-ID: <1d6c773a-8495-cf14-61b6-7616c8b80225@oracle.com> On 10/27/17 10:12 AM, Kumar Srinivasan wrote: > Hello Remi, Sundar and others, > > Please review the webrev [1] to update JDK's internal ASM to v6. > > [1] http://cr.openjdk.java.net/~ksrini/8190287/webrev.00/index.html The jlink and module-related change looks fine to me.? I also skimmed through asm6 change which looks fine too. Please update src/java.base/share/legal/asm.md to reflect the new version. 
thanks Mandy From frederic.parain at oracle.com Mon Oct 30 21:56:37 2017 From: frederic.parain at oracle.com (Frederic Parain) Date: Mon, 30 Oct 2017 17:56:37 -0400 Subject: [10] RFR 8186046 Minimal ConstantDynamic support In-Reply-To: <93431280-9CBF-4722-961D-F2D2D0F83B4E@oracle.com> References: <93431280-9CBF-4722-961D-F2D2D0F83B4E@oracle.com> Message-ID: I?m seeing no issue with rcx being aliased in this code. Fred > On Oct 30, 2017, at 15:44, Paul Sandoz wrote: > > Hi, > > Thanks for reviewing. > >> On 30 Oct 2017, at 11:05, Dmitry Samersoff wrote: >> >> Paul, >> >> templateTable_x86.cpp: >> >> 564 const Register flags = rcx; >> 565 const Register rarg = NOT_LP64(rcx) LP64_ONLY(c_rarg1); >> >> Should we use another register for rarg under NOT_LP64 ? >> > > I think it should be ok, it i ain?t an expert here on the interpreter and the calling conventions, so please correct me. > > Some more context: > > + const Register flags = rcx; > + const Register rarg = NOT_LP64(rcx) LP64_ONLY(c_rarg1); > + __ movl(rarg, (int)bytecode()); > > The current bytecode code is loaded into ?rarg? > > + call_VM(obj, CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_ldc), rarg); > > Then ?rarg" is the argument to the call to InterpreterRuntime::resolve_ldc, after which it is no longer referred to. > > +#ifndef _LP64 > + // borrow rdi from locals > + __ get_thread(rdi); > + __ get_vm_result_2(flags, rdi); > + __ restore_locals(); > +#else > + __ get_vm_result_2(flags, r15_thread); > +#endif > > The result from the call is then loaded into flags. > > So i don?t think it matters in this case if rcx is aliased. > > Paul. > >> -Dmitry >> >> >> On 10/26/2017 08:03 PM, Paul Sandoz wrote: >>> Hi, >>> >>> Please review the following patch for minimal dynamic constant support: >>> >>> http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8186046-minimal-condy-support-hs/webrev/ >>> >>> https://bugs.openjdk.java.net/browse/JDK-8186046 >>> https://bugs.openjdk.java.net/browse/JDK-8186209 >>> >>> This patch is based on the JDK 10 unified HotSpot repository. Testing so far looks good. >>> >>> By minimal i mean just the support in the runtime for a dynamic constant pool entry to be referenced by a LDC instruction or a bootstrap method argument. Much of the work leverages the foundations built by invoke dynamic but is arguably simpler since resolution is less complex. >>> >>> A small set of bootstrap methods will be proposed as a follow on issue for 10 (these are currently being refined in the amber repository). >>> >>> Bootstrap method invocation has not changed (and the rules are the same for dynamic constants and indy). It is planned to enhance this in a further major release to support lazy resolution of bootstrap method arguments. >>> >>> The CSR for the VM specification is here: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8189199 >>> >>> the j.l.invoke package documentation was also updated but please consider the VM specification as the definitive "source of truth" (we may clean up this area further later on so it becomes more informative, and that may also apply to duplicative text on MethodHandles/VarHandles). >>> >>> Any AoT-related work will be deferred to a future release. >>> >>> ? >>> >>> This patch only supports x64 platforms. There is a small set of changes specific to x64 (specifically to support null and primitives constants, as prior to this patch null was used as a sentinel for resolution and certain primitives types would never have been encountered, such as say byte). 
>>> >>> We will need to follow up with the SPARC platform and it is hoped/anticipated that OpenJDK members responsible for other platforms (namely ARM and PPC) will separately provide patches. >>> >>> ? >>> >>> Many of tests rely on an experimental byte code API that supports the generation of byte code with dynamic constants. >>> >>> One test uses class file bytes produced from a modified version of asmtools. The modifications have now been pushed but a new version of asmtools need to be rolled into jtreg before the test can operate directly on asmtools information rather than embedding class file bytes directly in the test. >>> >>> ? >>> >>> Paul. >>> >> > From david.holmes at oracle.com Tue Oct 31 00:21:45 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 31 Oct 2017 10:21:45 +1000 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> <22afedef-59cc-ecde-48fc-0afb7b4bbb47@oracle.com> <815ac734-ea8b-ea2d-ecec-85cb547ba2f4@oracle.com> <440f79ba-2da3-b627-53bc-e1842e3cf73c@oracle.com> Message-ID: <058662bb-5d5b-0085-cc08-02192d000838@oracle.com> On 31/10/2017 12:48 AM, coleen.phillimore at oracle.com wrote: > > http://cr.openjdk.java.net/~coleenp/8189610.incr.02/webrev/index.html > > Changed JDK file to use PATH_MAX.? Retested jdk tier1 tests. Why PATH_MAX instead of MAXPATHLEN? They appear to be the same on Linux and Solaris, but I don't know if that is true for AIX and Mac OS / BSD. Does UnixFileSystem_md.c still need the jvm.h include now? Thanks, David > thanks, > Coleen > > On 10/30/17 8:38 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 10/30/17 8:17 AM, David Holmes wrote: >>> On 30/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >>>> On 10/28/17 3:50 AM, David Holmes wrote: >>>>> Hi Coleen, >>>>> >>>>> I've commented on the file location in response to Mandy's email. >>>>> >>>>> The only issue I'm still concerned about is the JVM_MAXPATHLEN >>>>> issue. I think it is a bug to define a JVM_MAXPATHLEN that is >>>>> bigger than the platform MAXPATHLEN. I also would not want to see >>>>> any change in behaviour because of this - so AIX and Solaris should >>>>> not get a different JVM_MAXPATHLEN due to this refactoring change. >>>>> So yes I think this needs to be ifdef'd for Linux and reluctantly >>>>> (because it was a copy error) for OSX/BSD as well. >>>> >>>> #if defined(AIX) || defined(SOLARIS) >>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>> #else >>>> // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. This may >>>> //?????? cause problems if JVM and the rest of JDK are built on >>>> different >>>> //?????? Linux releases. Here we define JVM_MAXPATHLEN to be >>>> MAXPATHLEN + 1, >>>> //?????? so buffers declared in VM are always >= 4096. >>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>> #endif >>>> >>>> Is this ok? >>> >>> Yes - thanks. It preserves existing behaviour on the VM side at >>> least. Time will tell if it messes anything up on the JDK side for >>> Linux/OSX. >> >> I don't want to wait for time so I'm investigating. >> >> It's one use is: >> >> Java_java_io_UnixFileSystem_canonicalize0(JNIEnv *env, jobject this, >> ... >> ??????? 
char canonicalPath[JVM_MAXPATHLEN]; >> ??????? if (canonicalize((char *)path, >> ???????????????????????? canonicalPath, JVM_MAXPATHLEN) < 0) { >> ??????????? JNU_ThrowIOExceptionWithLastError(env, "Bad pathname"); >> >> Which goes to: >> >> canonicalize_md.c >> >> canonicalize(char *original, char *resolved, int len) >> ??? if (len < PATH_MAX) { >> ??????? errno = EINVAL; >> ??????? return -1; >> ??? } >> >> >> So this should fail every time. >> >> sys/param.h:# define MAXPATHLEN??? PATH_MAX >> >> I haven't found any tests for it. >> >> I don't know why Java_java_io_UnixFileSystem uses JVM_MAXPATHLEN since >> it's not calling the JVM interface as far as I can tell. I think it >> should be changed to PATH_MAX. >> >> ? >> Coleen >>> >>> David >>> >>>> thanks, >>>> Coleen >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 28/10/2017 12:08 AM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> >>>>>> On 10/27/17 9:37 AM, David Holmes wrote: >>>>>>> On 27/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 10/27/17 3:23 AM, David Holmes wrote: >>>>>>>>> Hi Coleen, >>>>>>>>> >>>>>>>>> Thanks for tackling this. >>>>>>>>> >>>>>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>>>> >>>>>>>>> Can you update the bug synopsis to show it covers both sets of >>>>>>>>> files please. >>>>>>>>> >>>>>>>>> I hate to start with this (and it took me quite a while to >>>>>>>>> realize it) but as Mandy pointed out jvm.h is not an exported >>>>>>>>> interface from the JDK to the outside world (so not subject to >>>>>>>>> CSR review), but is a private interface between the JVM and the >>>>>>>>> JDK libraries. So I think really jvm.h belongs in the hotspot >>>>>>>>> sources where it was, while jni.h belongs in the exported JDK >>>>>>>>> sources. In which case the bulk of your changes to the hotspot >>>>>>>>> files would not be needed - sorry. >>>>>>>> >>>>>>>> Maybe someone can make that decision and change at a later date. >>>>>>>> The point of this change is that there is now only one of these >>>>>>>> files that is shared.? I don't think jvm.h and the jvm_md.h >>>>>>>> belong on the hotspot sources for the jdk to find them in some >>>>>>>> random prims and os dependent directories. >>>>>>> >>>>>>> The one file that is needed is a hotspot file - jvm.h defines the >>>>>>> interface that hotspot exports via jvm.cpp. >>>>>>> >>>>>>> If you leave jvm.h in hotspot/prims then a very large chunk of >>>>>>> your boilerplate changes are not needed. The JDK code doesn't >>>>>>> care what the name of the directory is - whatever it is just gets >>>>>>> added as a -I directive (the JDK code will include "jvm.h" not >>>>>>> "prims/jvm.h" the way hotspot sources do. >>>>>>> >>>>>>> This isn't something we want to change back or move again later. >>>>>>> Whatever we do now we live with. >>>>>> >>>>>> I think it belongs with jni.h and I think the core libraries group >>>>>> would agree.?? It seems more natural there than buried in the >>>>>> hotspot prims directory.? I guess this is on hold while we have >>>>>> this debate.?? Sigh. >>>>>> >>>>>> Actually with -I directives, changing to jvm.h from prims/jvm.h >>>>>> would still work.?? Maybe we should change the name to jvm.hpp >>>>>> since it's jvm.cpp though??? Or maybe just have two divergent >>>>>> copies and close this as WNF. >>>>>> >>>>>>> >>>>>>>> I'm happy to withdraw the CSR. 
We generally use the CSR process >>>>>>>> to add and remove JVM_ interfaces even though they're a private >>>>>>>> interface in case some other JVM/JDK combination relies on them. >>>>>>>> The changes to these files are very minor though and not likely >>>>>>>> to cause any even theoretical incompatibility, so I'll withdraw it. >>>>>>>>> >>>>>>>>> Moving on ... >>>>>>>>> >>>>>>>>> First to address the initial comments/query you had: >>>>>>>>> >>>>>>>>>> The JDK windows jni_md.h file defined jint as long and the >>>>>>>>>> hotspot >>>>>>>>>> windows jni_x86.h as int. I had to choose the jdk version >>>>>>>>>> since it's the >>>>>>>>>> public version, so there are changes to the hotspot files for >>>>>>>>>> this. >>>>>>>>> >>>>>>>>> On Windows int and long are always the same as it uses ILP32 or >>>>>>>>> LLP64 (not LP64 like *nix platforms). So either choice should >>>>>>>>> be fine. That said there are some odd casting issues I comment >>>>>>>>> on below. Does the VS compiler complain about mixing int and >>>>>>>>> long in expressions? >>>>>>>> >>>>>>>> Yes, it does even though int and long are the same representation. >>>>>>> >>>>>>> And what an absolute mess that makes. :( >>>>>>> >>>>>>>>> >>>>>>>>>> Generally I changed the code to use 'int' rather than 'jint' >>>>>>>>>> where the >>>>>>>>>> surrounding API didn't insist on consistently using java >>>>>>>>>> types. We >>>>>>>>>> should mostly be using C++ types within hotspot except in >>>>>>>>>> interfaces to >>>>>>>>>> native/JNI code. >>>>>>>>> >>>>>>>>> I think you pulled too hard on a few threads here and things >>>>>>>>> are starting to unravel. There are numerous cases I refer to >>>>>>>>> below where either the cast seems unnecessary/inappropriate or >>>>>>>>> else highlights a bunch of additional changes that also need to >>>>>>>>> be made. The fan out from this could be horrendous. Unless you >>>>>>>>> actually get some kind of error - and I'd like to understand >>>>>>>>> the details of those - I would not suggest making these changes >>>>>>>>> as part of this work. >>>>>>>> >>>>>>>> I didn't make any change unless there was was an error. I have >>>>>>>> 100 failed JPRT jobs to confirm!? I eventually got a Windows >>>>>>>> system to compile and test this on. Actually some of the changes >>>>>>>> came out better.? Cases where we use jint as a bool simply >>>>>>>> turned to int. We do not have an overload for bool for cmpxchg. >>>>>>> >>>>>>> That's unfortunate - ditto for OrderAccess. >>>>>>> >>>>>>>>> >>>>>>>>> Looking through I have a quite a few queries/comments - >>>>>>>>> apologies in advance as I know how tedious this is: >>>>>>>>> >>>>>>>>> make/hotspot/lib/CompileLibjsig.gmk >>>>>>>>> src/java.base/solaris/native/libjsig/jsig.c >>>>>>>>> >>>>>>>>> Took a while to figure out why the include was needed. :) As a >>>>>>>>> follow up I suggest just deleting the -I include directive, >>>>>>>>> delete the Solaris-only definition of JSIG_VERSION_1_4_1, and >>>>>>>>> delete everything to do with JVM_get_libjsig_version. It is all >>>>>>>>> obsolete. >>>>>>>> >>>>>>>> Can I patch up jsig in a separate RFE?? I don't remember why >>>>>>>> this broke so I simply moved JSIG #define.? Is jsig obsolete? >>>>>>>> Removing JVM_* definitions generally requires a CSR. >>>>>>> >>>>>>> I did say "As a follow up". jsig is not obsolete but the jsig >>>>>>> versioning code, only used by Solaris, is. 
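Coming back to the Windows int/long point above: under ILP32/LLP64 the two types have the same 32-bit representation, but the language still treats them as distinct types, which is exactly why VS complains in places one might not expect. A minimal standalone illustration (not code from the patch):

    // Same size on Windows, still different types: overloads on them are
    // distinct functions, and pointers to them do not interconvert.
    void f(int)  { }
    void f(long) { }          // a legal, separate overload

    void demo() {
      long v = 0;
      f(v);                   // picks f(long), never f(int)
      // int* p = &v;         // rejected: 'long *' does not convert to 'int *'
    }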
>>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/cpu/arm/interp_masm_arm.cpp >>>>>>>>> >>>>>>>>> Why did you need to add the jvm.h include? >>>>>>>>> >>>>>>>> >>>>>>>> ?? tbz(Raccess_flags, JVM_ACC_SYNCHRONIZED_BIT, unlocked); >>>>>>> >>>>>>> Okay. I'm not going to try and figure out how this code found >>>>>>> this before. >>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/os/windows/os_windows.cpp. >>>>>>>>> >>>>>>>>> The type of process_exiting should be uint to match the DWORD >>>>>>>>> of GetCurrentThreadID. Then you should need any casts. Also you >>>>>>>>> missed this jint cast: >>>>>>>>> >>>>>>>>> 3796???????? process_exiting != (jint)GetCurrentThreadId()) { >>>>>>>> >>>>>>>> Yes, that's better to change process_exiting to a DWORD.? It >>>>>>>> needs a DWORD cast to 0 in the cmpxchg. >>>>>>>> >>>>>>>> ???????? Atomic::cmpxchg(GetCurrentThreadId(), &process_exiting, >>>>>>>> (DWORD)0); >>>>>>>> >>>>>>>> These templates are picky. >>>>>>> >>>>>>> Yes - their inability to deal with literals is extremely >>>>>>> frustrating. >>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/c1/c1_Canonicalizer.hpp >>>>>>>>> >>>>>>>>> ? 43 #ifdef _WINDOWS >>>>>>>>> ? 44?? // jint is defined as long in jni_md.h, so convert from >>>>>>>>> int to jint >>>>>>>>> ? 45?? void set_constant(int x) { set_constant((jint)x); } >>>>>>>>> ? 46 #endif >>>>>>>>> >>>>>>>>> Why is this necessary? int and long are the same on Windows. >>>>>>>>> The whole point is that jint hides the underlying type, so >>>>>>>>> where does this go wrong? >>>>>>>> >>>>>>>> No, they are not the same types even though they have the same >>>>>>>> representation! >>>>>>> >>>>>>> This is truly unfortunate. >>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>>>>>>> >>>>>>>>> ?ConstantIntValue((jint)0); >>>>>>>>> >>>>>>>>> why is this cast needed? what causes the ambiguity? (If this >>>>>>>>> was a template I'd understand ;-) ). Also didn't you change >>>>>>>>> that constructor to take an int anyway - not that I think it >>>>>>>>> should - see below. >>>>>>>> >>>>>>>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't >>>>>>>> match 'long' better than any pointer type.? So this cast is needed. >>>>>>> >>>>>>> But you changed the constructor to take an int! >>>>>>> >>>>>>> ?class ConstantIntValue: public ScopeValue { >>>>>>> ? private: >>>>>>> -? jint _value; >>>>>>> +? int _value; >>>>>>> ? public: >>>>>>> -? ConstantIntValue(jint value)???????? { _value = value; } >>>>>>> +? ConstantIntValue(int value)????????? { _value = value; } >>>>>>> >>>>>>> >>>>>> >>>>>> Okay I removed this cast. >>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/ci/ciReplay.cpp >>>>>>>>> >>>>>>>>> 793???????? jint* dims = NEW_RESOURCE_ARRAY(jint, rank); >>>>>>>>> >>>>>>>>> why should this be jint? >>>>>>>> >>>>>>>> To avoid a cast from int* to jint* in the line below: >>>>>>>> >>>>>>>> ????????? value = kelem->multi_allocate(rank, dims, CHECK); >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/classfile/altHashing.cpp >>>>>>>>> >>>>>>>>> Okay this looks more consistent with jint. >>>>>>>> >>>>>>>> Yes.? I translated this from some native code iirc. >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/code/debugInfo.hpp >>>>>>>>> >>>>>>>>> These changes seem wrong. 
We have: >>>>>>>>> >>>>>>>>> ConstantLongValue(jlong value) >>>>>>>>> ConstantDoubleValue(jdouble value) >>>>>>>>> >>>>>>>>> so we should have: >>>>>>>>> >>>>>>>>> ConstantIntValue(jint value) >>>>>>>> >>>>>>>> Again, there are multiple call sites with '0', which match int >>>>>>>> trivially but are confused with long.? It's less consistent I >>>>>>>> agree but better to not cast all the call sites. >>>>>>> >>>>>>> This is really making a mess of the APIs - they should be a jint >>>>>>> but we declare them int because of a 0 casting problem. Can't we >>>>>>> just use 0L? >>>>>> >>>>>> There aren't that many casts.? You're right, that would have been >>>>>> better in some places. >>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/code/relocInfo.cpp >>>>>>>>> >>>>>>>>> Change seems unnecessary - int32_t is fine >>>>>>>>> >>>>>>>> >>>>>>>> No, int32_t doesn't match the calls below it.? They all assume >>>>>>>> _lo and _hi are jint. >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/compiler/compileBroker.cpp >>>>>>>>> src/hotspot/share/compiler/compileBroker.hpp >>>>>>>>> >>>>>>>>> I see a complete mix of int and jint in this class, so why make >>>>>>>>> the one change you did ?? >>>>>>>> >>>>>>>> This is another case of using jint as a flag with cmpxchg. The >>>>>>>> templates for cmpxchg want the types to match and 0 and 1 are >>>>>>>> essentially 'int'.? This is a lot cleaner this way. >>>>>>> >>>>>>> >>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >>>>>>>>> >>>>>>>>> 1700???? tty->write((char*) start, MIN2(length, (jint)O_BUFLEN)); >>>>>>>>> >>>>>>>>> why did you need to add the jint cast? It's used without any >>>>>>>>> cast on the next two lines: >>>>>>>>> >>>>>>>>> 1701???? length -= O_BUFLEN; >>>>>>>>> 1702???? offset += O_BUFLEN; >>>>>>>>> >>>>>>>> >>>>>>>> There's a conversion from O_BUFLEN from int to long in 1701 and >>>>>>>> 1702.?? MIN2 is a template that wants the types to match exactly. >>>>>>> >>>>>>> $%^%$! templates! >>>>>>> >>>>>>>>> ?? >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>>>>>>>> >>>>>>>>> Looking around this code it seems very confused about types - >>>>>>>>> eg the previous function is declared jboolean yet returns a >>>>>>>>> jint on one path! It isn't clear to me if the return type is >>>>>>>>> what should be changed or the parameter type? I would just >>>>>>>>> leave this alone. >>>>>>>> >>>>>>>> I can't leave it alone because it doesn't compile that way. This >>>>>>>> was the minimal change and yea, does look a bit inconsistent. >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/opto/mulnode.cpp >>>>>>>>> >>>>>>>>> Okay TypeInt has jint parts, so the remaining int32_t >>>>>>>>> declarations (A, B, C, D) should also be jint. >>>>>>>> >>>>>>>> Yes.? c2 uses jint types. >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/opto/parse3.cpp >>>>>>>>> >>>>>>>>> I agree with the changes you made, but then: >>>>>>>>> >>>>>>>>> ?419???? jint dim_con = find_int_con(length[j], -1); >>>>>>>>> >>>>>>>>> should also be changed. >>>>>>>>> >>>>>>>>> And obviously MultiArrayExpandLimit should be defined as int >>>>>>>>> not intx! >>>>>>>> >>>>>>>> Everything in globals.hpp is intx.? That's a thread that I don't >>>>>>>> want to pull on! >>>>>>> >>>>>>> We still have that limitation? >>>>>>>> >>>>>>>> Changed dim_con to int. 
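Several of the casts above (MIN2, Atomic::cmpxchg) are forced by template argument deduction rather than by any representation issue: deduction needs every argument to yield the same T, and a bare integer literal is an int. A small self-contained illustration with a MIN2-like helper (written here for the example; it is not the actual HotSpot declaration):

    template <class T>
    T min2(T a, T b) { return a < b ? a : b; }

    void demo(long length) {
      // min2(length, 512);      // error: T deduced as both 'long' and 'int'
      min2(length, (long)512);   // fine once the literal has the same type
      min2<long>(length, 512);   // or name T explicitly and let 512 convert
    }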
>>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/opto/phaseX.cpp >>>>>>>>> >>>>>>>>> I can see that intcon(jint i) is consistent with longcon(jlong >>>>>>>>> l), but the use of "i" in the code is more consistent with int >>>>>>>>> than jint. >>>>>>>> >>>>>>>> huh?? really? >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/opto/type.cpp >>>>>>>>> >>>>>>>>> 1505 int TypeInt::hash(void) const { >>>>>>>>> 1506?? return java_add(java_add(_lo, _hi), >>>>>>>>> java_add((jint)_widen, (jint)Type::Int)); >>>>>>>>> 1507 } >>>>>>>>> >>>>>>>>> I can see that the (jint) casts you added make sense, but then >>>>>>>>> the whole function should be returning jint not int. Ditto the >>>>>>>>> other hash functions. >>>>>>>> >>>>>>>> I'm not messing with this, this is the minimal in type fixing >>>>>>>> that I'm going to do here. >>>>>>> >>>>>>> >>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/prims/jni.cpp >>>>>>>>> >>>>>>>>> I think vm_created should be a bool. In fact all the fields you >>>>>>>>> changed are logically bools - do Atomics work for bool now? >>>>>>>> >>>>>>>> No, they do not.?? I had thought bool would be better originally >>>>>>>> too. >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/prims/jvm.cpp >>>>>>>>> >>>>>>>>> is_attachable is the terminology used in the JDK code. >>>>>>>> >>>>>>>> Well the JDK version had is_attach_supported() as the flag name >>>>>>>> so I used that in this one place. >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/prims/jvmtiEnvBase.cpp >>>>>>>>> src/hotspot/share/prims/jvmtiImpl.cpp >>>>>>>>> >>>>>>>>> Are you making parameters consistent with the fields they >>>>>>>>> initialize? >>>>>>>> >>>>>>>> They're consistent with the declarations now. >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/prims/jvmtiTagMap.cpp >>>>>>>>> >>>>>>>>> There is a mix of int and jint for slot in this code. You fixed >>>>>>>>> some, but this remains: >>>>>>>>> >>>>>>>>> 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong >>>>>>>>> thread_tag, >>>>>>>>> 2441 jlong tid, >>>>>>>>> 2442 jint depth, >>>>>>>>> 2443 jmethodID method, >>>>>>>>> 2444 jlocation bci, >>>>>>>>> 2445 jint slot, >>>>>>>> >>>>>>>> Right for consistency with the declarations. >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/runtime/perfData.cpp >>>>>>>>> >>>>>>>>> Callers pass both jint and int, so param type seems arbitrary. >>>>>>>> >>>>>>>> They are, but importantly they match the declarations. >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/hotspot/share/runtime/perfMemory.cpp >>>>>>>>> src/hotspot/share/runtime/perfMemory.hpp >>>>>>>>> >>>>>>>>> PerfMemory::_initialized should ideally be a bool - can >>>>>>>>> OrderAccess handle that now? >>>>>>>> >>>>>>>> Nope. >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/java.base/share/native/include/jvm.h >>>>>>>>> >>>>>>>>> Not clear why the jio functions are not also JNICALL ? >>>>>>>> >>>>>>>> They are now.? The JDK version didn't have JNICALL. JVM needs >>>>>>>> JNICALL.? I can't tell you why JDK didn't need JNICALL linkage. >>>>>>> >>>>>>> ?? JVM currently does not have JNICALL. But they are declared as >>>>>>> "extern C". >>>>>> >>>>>> This was a compilation error on Windows with JDK.?? Maybe the C >>>>>> code in the JDK doesn't complain about linkage differences. I'll >>>>>> have to go back and figure this out then. 
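On the JNICALL puzzle just above: on Windows JNICALL expands to __stdcall, and in C++ the calling convention is part of a function's type, so a header/definition mismatch is a hard error in a C++ translation unit (which may be part of why hotspot trips over it while the C code on the JDK side, as Coleen speculates, never complained). A sketch of the kind of mismatch involved -- jvm_example is a made-up name for the illustration:

    #include "jni.h"   // for JNICALL (__stdcall on Windows, empty elsewhere)

    // What a C++ caller sees in the header:
    extern "C" int JNICALL jvm_example(int x);

    // A definition written without JNICALL defaults to __cdecl, and MSVC's C++
    // front end rejects it as a redeclaration with different type modifiers:
    //
    //   extern "C" int jvm_example(int x) { return x; }
    //
    // Keeping the convention on both sides resolves it:
    extern "C" int JNICALL jvm_example(int x) { return x; }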
>>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/java.base/unix/native/include/jni_md.h >>>>>>>>> >>>>>>>>> There is no need to special case ARM. The differences in the >>>>>>>>> existing code were for LTO support and that is now irrelevant. >>>>>>>> >>>>>>>> See discussion with Magnus.?? We still build ARM for jdk10/hs so >>>>>>>> I needed this conditional or of course I wouldn't have added >>>>>>>> it.? We can remove it with LTO support. >>>>>>> >>>>>>> Those builds are gone - this is obsolete. But yes all LTO can be >>>>>>> removed later if you wish. Just trying to simplify things now. >>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/java.base/unix/native/include/jvm_md.h >>>>>>>>> >>>>>>>>> I know you've just copied this across, but it seems wrong to me: >>>>>>>>> >>>>>>>>> ?57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on >>>>>>>>> others. This may >>>>>>>>> ? 58 //?????? cause problems if JVM and the rest of JDK are >>>>>>>>> built on different >>>>>>>>> ? 59 //?????? Linux releases. Here we define JVM_MAXPATHLEN to >>>>>>>>> be MAXPATHLEN + 1, >>>>>>>>> ? 60 //?????? so buffers declared in VM are always >= 4096. >>>>>>>>> ? 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>>>>>> >>>>>>>>> It doesn't make sense to me to define an internal "max path >>>>>>>>> length" that can _exceed_ the platform max! >>>>>>>>> >>>>>>>>> That aside there's no support for building different parts of >>>>>>>>> the JDK on different platforms and then bringing them together. >>>>>>>>> And in any case I would think the real problem would be >>>>>>>>> building on a platform that uses 4096 and running on one that >>>>>>>>> uses 4095! >>>>>>>>> >>>>>>>>> But that aside this is a Linux hack and should be guarded by >>>>>>>>> ifdef LINUX. (I doubt BSD needs it, the bsd file is just a copy >>>>>>>>> of the linux one - the JDK macosx version does the right >>>>>>>>> thing). Solaris and AIX should stay as-is at MAXPATHLEN. >>>>>>>> >>>>>>>> All of the unix platforms had MAXPATHLEN+1.? I'll leave it for >>>>>>>> now and we can investigate that further. >>>>>>> >>>>>>> I see the following existing code: >>>>>>> >>>>>>> src/java.base/unix/native/include/jvm_md.h: >>>>>>> >>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>>> >>>>>>> src/java.base/macosx/native/include/jvm_md.h >>>>>>> >>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>>> >>>>>>> src/hotspot/os/aix/jvm_aix.h >>>>>>> >>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>>> >>>>>>> src/hotspot/os/bsd/jvm_bsd.h >>>>>>> >>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1? // blindly copied from >>>>>>> Linux version >>>>>>> >>>>>>> src/hotspot/os/linux/jvm_linux.h >>>>>>> >>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>>>> >>>>>>> src/hotspot/os/solaris/jvm_solaris.h >>>>>>> >>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>>> >>>>>>> This is a linux only hack (if you ignore the blind copy from >>>>>>> linux into the BSD code in the VM). >>>>>> >>>>>> Oh, thanks, so should I add a bunch of ifdefs then?? Or do you >>>>>> think having MAXPATHLEN + 1 will really break the other >>>>>> platforms?? Do you really see this as a problem or are you just >>>>>> pointing out inconsistency? >>>>>>> >>>>>>>>> >>>>>>>>> ?86 #define ASYNC_SIGNAL???? SIGJVM2 >>>>>>>>> >>>>>>>>> This only exists on Solaris so I think should be in #ifdef >>>>>>>>> SOLARIS, to make that clear. >>>>>>>> >>>>>>>> Ok.? I'll add this. 
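Presumably the guard being agreed to ends up with roughly this shape in the shared jvm_md.h (a sketch of the shape only, not the actual patch):

    #ifdef SOLARIS
    // SIGJVM2 only exists on Solaris; no other platform defines an equivalent here.
    #define ASYNC_SIGNAL SIGJVM2
    #endif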
>>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> src/java.base/windows/native/include/jvm_md.h >>>>>>>>> >>>>>>>>> Given the differences between the two versions either something >>>>>>>>> has been broken or "extern C" declarations are not needed :) >>>>>>>> >>>>>>>> Well, they are needed for Hotspot to build and do not prevent >>>>>>>> jdk from building.? I don't know what was broken. >>>>>>> >>>>>>> We really need to understand this better. Maybe related to the >>>>>>> map files that expose the symbols. ?? >>>>>> >>>>>> They're needed because the JDK files are written mostly in C and >>>>>> that doesn't complain about the linkage difference. Hotspot files >>>>>> are in C++ which does complain. >>>>>> >>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> That was a really painful way to spend most of my Friday. TGIF! :) >>>>>>>> >>>>>>>> Thanks for going through it.? See comments inline for changes. >>>>>>>> Generating a webrev takes hours so I'm not going to do that >>>>>>>> unless you insist. >>>>>>> >>>>>>> An incremental webrev shouldn't take long - right? You're a mq >>>>>>> maestro now. :) >>>>>> >>>>>> Well I generally trash a repository whenever I use mq but sure. >>>>>>> >>>>>>> If you can reasonably produce an incremental webrev once you've >>>>>>> settled on all the comments/issues that would be good. >>>>>> >>>>>> Ok, sure. >>>>>> >>>>>> Coleen >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> ----- >>>>>>>>> >>>>>>>>> >>>>>>>>> On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> ??Hi Magnus, >>>>>>>>>> >>>>>>>>>> Thank you for reviewing this.?? I have a new version that >>>>>>>>>> takes out the hack in globalDefinitions.hpp and adds casts to >>>>>>>>>> src/hotspot/share/opto/type.cpp instead. >>>>>>>>>> >>>>>>>>>> Also some fixes from Martin at SAP. >>>>>>>>>> >>>>>>>>>> open webrev at >>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >>>>>>>>>> >>>>>>>>>> see below. >>>>>>>>>> >>>>>>>>>> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>>>>>>>>>> Coleen, >>>>>>>>>>> >>>>>>>>>>> Thank you for addressing this! >>>>>>>>>>> >>>>>>>>>>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>>>>>>> >>>>>>>>>>>> Mostly used sed to remove prims/jvm.h and move #include >>>>>>>>>>>> "jvm.h" after precompiled.h, so if you have repetitive >>>>>>>>>>>> stress wrist issues don't click on most of these files. >>>>>>>>>>>> >>>>>>>>>>>> There were more issues to resolve, however. The JDK windows >>>>>>>>>>>> jni_md.h file defined jint as long and the hotspot windows >>>>>>>>>>>> jni_x86.h as int. I had to choose the jdk version since it's >>>>>>>>>>>> the public version, so there are changes to the hotspot >>>>>>>>>>>> files for this. Generally I changed the code to use 'int' >>>>>>>>>>>> rather than 'jint' where the surrounding API didn't insist >>>>>>>>>>>> on consistently using java types. We should mostly be using >>>>>>>>>>>> C++ types within hotspot except in interfaces to native/JNI >>>>>>>>>>>> code. There are a couple of hacks in places where adding >>>>>>>>>>>> multiple jint casts was too painful. >>>>>>>>>>>> >>>>>>>>>>>> Tested with JPRT and tier2-4 (in progress). >>>>>>>>>>>> >>>>>>>>>>>> open webrev at >>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>>>>>>>>>> >>>>>>>>>>> Looks great! 
>>>>>>>>>>> >>>>>>>>>>> Just a few comments: >>>>>>>>>>> >>>>>>>>>>> * src/java.base/unix/native/include/jni_md.h: >>>>>>>>>>> >>>>>>>>>>> I don't think the externally_visible attribute should be >>>>>>>>>>> there for arm. I know this was the case for the corresponding >>>>>>>>>>> hotspot file for arm, but that was techically incorrect. The >>>>>>>>>>> proper dependency here is that externally_visible should be >>>>>>>>>>> in all JNIEXPORT if and only if we're building with JVM >>>>>>>>>>> feature "link-time-opt". Traditionally, that feature been >>>>>>>>>>> enabled when building arm32 builds, and only then, so there's >>>>>>>>>>> been a (coincidentally) connection here. Nowadays, Oracle >>>>>>>>>>> does not care about the arm32 builds, and I'm not sure if >>>>>>>>>>> anyone else is building them with link-time-opt enabled. >>>>>>>>>>> >>>>>>>>>>> It does seem wrong to me to export this behavior in the >>>>>>>>>>> public jni_md.h file, though. I think the correct way to >>>>>>>>>>> solve this, if we should continue supporting link-time-opt is >>>>>>>>>>> to make sure this attribute is set for exported hotspot >>>>>>>>>>> functions. If it's still needed, that is. A quick googling >>>>>>>>>>> seems to indicate that visibility("default") might be enough >>>>>>>>>>> in modern gcc's. >>>>>>>>>>> >>>>>>>>>>> A third option is to remove the support for link-time-opt >>>>>>>>>>> entirely, if it's not really used. >>>>>>>>>> >>>>>>>>>> I didn't know how to change this since we are still building >>>>>>>>>> ARM with the jdk10/hs repository, and ARM needed this change. >>>>>>>>>> I could wait until we bring down the jdk10/master changes that >>>>>>>>>> remove the ARM build and remove this conditional before I >>>>>>>>>> push. Or we could file an RFE to remove link-time-opt (?) and >>>>>>>>>> remove it then? >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> * src/java.base/unix/native/include/jvm_md.h and >>>>>>>>>>> src/java.base/windows/native/include/jvm_md.h: >>>>>>>>>>> >>>>>>>>>>> These files define a public API, and contain non-trivial >>>>>>>>>>> changes. I suspect you should file a CSR request. (Even >>>>>>>>>>> though I realize you're only matching the header file with >>>>>>>>>>> the reality.) >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I filed the CSR.?? Waiting for the next steps. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>>>> /Magnus >>>>>>>>>>> >>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>>>>>>>>>> >>>>>>>>>>>> I have a script to update copyright files on commit. >>>>>>>>>>>> >>>>>>>>>>>> Thanks to Magnus and ErikJ for the makefile changes. 
>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Coleen >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>> >>>> >> > From david.holmes at oracle.com Tue Oct 31 06:33:10 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 31 Oct 2017 16:33:10 +1000 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <29688c76-4983-dffc-6ce2-402cf91dafbf@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <57390ec3-8d8d-a3d7-9774-b5945a323be9@oracle.com> <0f568e05-6f06-d2df-571e-0c591f062c15@oracle.com> <29688c76-4983-dffc-6ce2-402cf91dafbf@oracle.com> Message-ID: On 30/10/2017 10:15 PM, coleen.phillimore at oracle.com wrote: > On 10/28/17 3:58 AM, David Holmes wrote: >> On 28/10/2017 6:20 AM, coleen.phillimore at oracle.com wrote: >>> >>> Incremental webrev: >>> >>> http://cr.openjdk.java.net/~coleenp/8189610.incr.01/webrev/index.html >> >> That all looks fine - thanks. >> >> If I get a chance I'll look deeper into why the VS compiler needs 0 to >> be cast to jint (aka long) to avoid ambiguity with it being a NULL >> pointer. I could understand if it always needed the cast, but not only >> needing it for long, but not int. > > Thanks,? Kim can probably tell you where in the spec this is. Now I get it. Given: void x(int i) { ...} void x(Foo* p) { ... } a call x(0) is a call to x(int) because 0 is an int. No conversion needed. But given: void x(long i) { ...} void x(Foo* p) { ... } a call x(0) has no direct match (no int version) so standard conversions apply and IIUC conversion to long and conversion to Foo* have the same rank, so neither is preferred and the call is ambiguous. David > Coleen > >> >> Thanks, >> David >> >>> thanks, >>> Coleen >>> >>> On 10/27/17 11:13 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 10/27/17 9:37 AM, David Holmes wrote: >>>>>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>>>>> >>>>>>> ?ConstantIntValue((jint)0); >>>>>>> >>>>>>> why is this cast needed? what causes the ambiguity? (If this was >>>>>>> a template I'd understand ;-) ). Also didn't you change that >>>>>>> constructor to take an int anyway - not that I think it should - >>>>>>> see below. >>>>>> >>>>>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't match >>>>>> 'long' better than any pointer type.? So this cast is needed. >>>>> >>>>> But you changed the constructor to take an int! >>>>> >>>>> ?class ConstantIntValue: public ScopeValue { >>>>> ? private: >>>>> -? jint _value; >>>>> +? int _value; >>>>> ? public: >>>>> -? ConstantIntValue(jint value)???????? { _value = value; } >>>>> +? ConstantIntValue(int value)????????? { _value = value; } >>>>> >>>> I changed this back to not take an int and changed c1_LinearScan.cpp >>>> to have the (jint)0 cast and output.cp needed (jint)0 casts.? 0L >>>> doesn't work for platforms where jint is an 'int' rather than a long >>>> because it's ambiguous with the functions that take a pointer type. >>>> Probably better to keep the type of ConstantIntValue consistent with >>>> j types. 
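David's explanation above is easy to verify with a tiny standalone case (Foo, x and y are illustrative names, not code from the patch):

    struct Foo;

    void x(int)  { }
    void x(Foo*) { }

    void y(long) { }
    void y(Foo*) { }

    void demo() {
      x(0);           // fine: 0 is an int, exact match for x(int)
      // x(0L);       // ambiguous (Coleen's 0L case): long -> int and 0L -> Foo* tie
      // y(0);        // ambiguous: 0 -> long and 0 -> Foo* are conversions of equal rank
      y(0L);          // fine: exact match for y(long)
      y((Foo*)0);     // fine: explicit choice of the pointer overload
    }

This is also why the (jint)0 casts in c1_LinearScan.cpp and output.cpp are the least invasive fix: they turn the call back into an exact match for whichever width jint happens to be on the platform.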
>>>> >>>> Thanks, >>>> Coleen >>> > From david.holmes at oracle.com Tue Oct 31 10:27:47 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 31 Oct 2017 20:27:47 +1000 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <6e10687c-f70e-5ee7-414f-b2c22d3e8f21@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> <59F2DC24.8050701@oracle.com> <59F2F01A.403@oracle.com> <4ebb905f23324a00b9cf10d8d410d420@sap.com> Message-ID: <9ff3abc3-9809-a9df-141b-15f0b05bd8a4@oracle.com> Hi Robbin, On 31/10/2017 12:34 AM, Robbin Ehn wrote: > Thanks! > > There have been a bit hesitation and confusion about the option (at > least internally). > The option is opt-out but in globals.hpp it starts out as false. > > Now instead we explicit set it true in globals.hpp but we turn it off if > we notice that: > - We are on an unsupported platform > - User have specified UseAOT > - User have specified EnableJVMCI That logic from #4617 onwards is absolutely doing my head in! 4617 bool aot_enabled = UseAOT && ((AOTLibrary != NULL) || !FLAG_IS_DEFAULT(UseAOT)); why do we care if the flag is default or not? If they set an AOTLibrary they expect AOT to be enabled. If they know UseAOT is true by default then they won't set it explicitly and so the flag will be default. If they set UseAOT directly but don't set a library then they won't get AOT - and UseAOT should be turned off somewhere else. 4623 if (FLAG_IS_DEFAULT(UseAOT)) { Why do we care if it is default or not? If we got here AOT is not enabled. We can just do: if (UseAOT) FLAG_SET_DEFAULT(UseAOT, false) or even skip the query and just set it false. 4627 if (FLAG_IS_DEFAULT(ThreadLocalHandshakes) && ThreadLocalHandshakes) { Okay I get why you check for default here :) 4631 FLAG_SET_ERGO(bool, ThreadLocalHandshakes, false); I wouldn't really say this is an "ergo" choice - if we can't have it on then we set it off - just as previously done with UseAOT. 4632 } else if (!FLAG_IS_DEFAULT(UseAOT) && UseAOT) { Again why do we care about default? You seem to be saying that "java -XX:+UseAOT -XX:AOTLibrary=..." is a stronger request for AOT than just "java -XX:AOTLibrary=...". But I'd always use the latter if I know UseAOT defaults to true anyway. 4635 FLAG_SET_ERGO(bool, ThreadLocalHandshakes, false); 4639 FLAG_SET_ERGO(bool, ThreadLocalHandshakes, false); Same ergo comment. I'm also thinking, if this is platform dependent then shouldn't ThreadLocalHandshakes be a product_pd flag, with pd specific default setting - and turning it on when on an unsupported platform should be a error ? Thanks, David ----- > Here is webrev for changes needed: > http://cr.openjdk.java.net/~rehn/8185640/v8/Option-Cleanup-12/webrev/ > And here is CSR: > https://bugs.openjdk.java.net/browse/JDK-8189942 > > Manual testing + basic testing done. 
> > And since I'm really hoping that this can be the last incremental, here > is my whole patch queue flatten out: > http://cr.openjdk.java.net/~rehn/8185640/v8/Full/webrev/ > > Thanks, Robbin > > On 10/27/2017 04:47 PM, Doerr, Martin wrote: >> Hi Robbin, >> >> excellent. I think this matches what Coleen had proposed, now. >> Thanks for doing all the work with so many incremental patches and for >> responding on so many discussions. Seems to be a tough piece of work. >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >> Sent: Freitag, 27. Oktober 2017 15:15 >> To: Erik ?sterlund ; Andrew Haley >> ; Doerr, Martin ; Karen Kinnear >> ; Coleen Phillimore >> (coleen.phillimore at oracle.com) >> Cc: hotspot-dev developers >> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >> >> Hi all, >> >> Poll in switches: >> http://cr.openjdk.java.net/~rehn/8185640/v7/Interpreter-Poll-Switch-10/ >> >> Poll in return: >> http://cr.openjdk.java.net/~rehn/8185640/v7/Interpreter-Poll-Ret-11/ >> >> Please take an extra look at poll in return. >> >> Sanity tested, big test run still running (99% complete - OK). >> >> Performance regression for the added polls increased to total of >> -0.68% vs >> global poll. (was -0.44%) >> >> We are discussing the opt-out option, the newest suggestion is to make it >> diagnostic. Opinions? >> >> For anyone applying these patches, the number 9 patch changes the >> option from >> product. I have not sent that out. >> >> Thanks, Robbin >> >> >> From artem.smotrakov at oracle.com Tue Oct 31 10:58:07 2017 From: artem.smotrakov at oracle.com (Artem Smotrakov) Date: Tue, 31 Oct 2017 13:58:07 +0300 Subject: RFR [10] 8189800: Add support for AddressSanitizer In-Reply-To: <4982c49e-859c-75d4-e5b1-b4a68b49d746@oracle.com> References: <51eabbae-5435-59be-f443-a6b214a17513@oracle.com> <4982c49e-859c-75d4-e5b1-b4a68b49d746@oracle.com> Message-ID: <3b08231c-4f8b-3d9a-3515-15afe743b82d@oracle.com> Hi David, That's a good question, I thought about it. According to [1]: - recommended versions of gcc is 4.9.2 - the minimum accepted version of gcc is 4.7 (Older versions will generate a warning by `configure` and are unlikely to work.) - the minimum accepted version of clang is 3.2 (Older versions will not be accepted by `configure`) It looks like that clang has to be at least 3.2 which should contain AddressSanitizer. Only for gcc, there may be a chance that someone wants to use 4.7. So, we might want to check version to see if it's 4.7, although I am not sure how many people would like to use gcc 4.7. As a result, this case didn't look very common to me, so I preferred to simplify the patch, and didn't add such a check. Without version check, compilation is going to fail if gcc 4.7 is used, and -fsanitize=address enabled. [1] http://hg.openjdk.java.net/jdk10/master/file/438e0c9f2f17/doc/building.md Artem On 10/31/2017 01:37 PM, David Holmes wrote: > Hi Artem, > > On 28/10/2017 6:02 AM, Artem Smotrakov wrote: >> Hello, >> >> Please review the following patch which adds support for >> AddressSanitizer. >> >> AddressSanitizer is a runtime memory error detector which looks for >> various memory corruption issues and leaks. >> >> Please refer to [1] for details. AddressSanitizer is available in gcc >> 4.8+ and clang 3.1+ > > Should we be checking the version before adding the flags? > > Thanks, > David > >> The patch below introduces --enable-asan parameter for the configure >> script which enables AddressSanitizer. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8189800 >> Webrev: http://cr.openjdk.java.net/~asmotrak/8189800/webrev.00/ >> >> [1] https://github.com/google/sanitizers/wiki/AddressSanitizer >> >> Artem From david.holmes at oracle.com Tue Oct 31 12:24:10 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 31 Oct 2017 22:24:10 +1000 Subject: RFR [10] 8189800: Add support for AddressSanitizer In-Reply-To: <3b08231c-4f8b-3d9a-3515-15afe743b82d@oracle.com> References: <51eabbae-5435-59be-f443-a6b214a17513@oracle.com> <4982c49e-859c-75d4-e5b1-b4a68b49d746@oracle.com> <3b08231c-4f8b-3d9a-3515-15afe743b82d@oracle.com> Message-ID: <8a97162b-4055-472c-dc55-5e38bd9a5ca8@oracle.com> Sounds reasonable. Anyone using older gcc simply won't/shouldn't enable Asan. Thanks, David On 31/10/2017 8:58 PM, Artem Smotrakov wrote: > Hi David, > > That's a good question, I thought about it. According to [1]: > > - recommended versions of gcc is 4.9.2 > - the minimum accepted version of gcc is 4.7 (Older versions will > generate a warning by `configure` and are unlikely to work.) > - the minimum accepted version of clang is 3.2 (Older versions will not > be accepted by `configure`) > > It looks like that clang has to be at least 3.2 which should contain > AddressSanitizer. Only for gcc, there may be a chance that someone wants > to use 4.7. So, we might want to check version to see if it's 4.7, > although I am not sure how many people would like to use gcc 4.7. As a > result, this case didn't look very common to me, so I preferred to > simplify the patch, and didn't add such a check. > > Without version check, compilation is going to fail if gcc 4.7 is used, > and -fsanitize=address enabled. > > [1] > http://hg.openjdk.java.net/jdk10/master/file/438e0c9f2f17/doc/building.md > > Artem > > On 10/31/2017 01:37 PM, David Holmes wrote: >> Hi Artem, >> >> On 28/10/2017 6:02 AM, Artem Smotrakov wrote: >>> Hello, >>> >>> Please review the following patch which adds support for >>> AddressSanitizer. >>> >>> AddressSanitizer is a runtime memory error detector which looks for >>> various memory corruption issues and leaks. >>> >>> Please refer to [1] for details. AddressSanitizer is available in gcc >>> 4.8+ and clang 3.1+ >> >> Should we be checking the version before adding the flags? >> >> Thanks, >> David >> >>> The patch below introduces --enable-asan parameter for the configure >>> script which enables AddressSanitizer. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8189800 >>> Webrev: http://cr.openjdk.java.net/~asmotrak/8189800/webrev.00/ >>> >>> [1] https://github.com/google/sanitizers/wiki/AddressSanitizer >>> >>> Artem > From magnus.ihse.bursie at oracle.com Tue Oct 31 12:41:09 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Tue, 31 Oct 2017 13:41:09 +0100 Subject: RFR [10] 8189800: Add support for AddressSanitizer In-Reply-To: <3b4c5abb-762f-a66c-02d5-93909dc656d4@oracle.com> References: <51eabbae-5435-59be-f443-a6b214a17513@oracle.com> <55e0e055-2e65-5c83-3f8e-36895f71860e@oracle.com> <3b4c5abb-762f-a66c-02d5-93909dc656d4@oracle.com> Message-ID: On 2017-10-30 10:31, Artem Smotrakov wrote: > Hi Magnus, > > The current approach uses AddressSanitizer as a shared library > (libasan.so) which is part of GCC/Clang toolkit. In case you use > system toolkit, then libasan.so is available for linker and at > runtime. But if you set a custom toolkit by --with-devkit option, then > libasan.so form this toolkit may not be available for linker and at > runtime by default. 
As a result, you can get errors while linking and > running. To fix that, you normally need to make it available using > ldconfig, or update LD_LIBRARY_PATH. That's why it updates > LD_LIBRARY_PATH with DEVKIT_LIB_DIR if a custom toolkit was used. That > may be helpful when you build JDK in environment like jib/jprt. > > I tried to remove exporting ASAN_ENABLED and DEVKIT_LIB_DIR, and as a > result, ASAN_OPTIONS and DEVKIT_LIB_DIR didn't go to jtreg command > which caused tests to fail when you run "make test". If we don't > export ASAN_OPTIONS and DEVKIT_LIB_DIR, then the updates in > TestCommon.gmk don't make much sense to me because those variables > have to be explicitly set for "make" anyway. > > I can remove exporting those variables and revert TestCommon.gmk. > Although, it looks nicer to me if we can run the tests just with "make > test" without specifying ASAN_OPTIONS and DEVKIT_LIB_DIR explicitly. > > What do you think? Ah, I see. TestCommon.gmk is not properly integrated into the rest of the build system. I'm still a bit surprised at this behavior, but I accept your explanation. Keep it as it is. TestCommon is due to be removed by the new RunTests.gmk (which is properly integrated), and when that happens, we can remove the exports then. /Magnus > > Artem > > > On 10/30/2017 10:50 AM, Magnus Ihse Bursie wrote: >> On 2017-10-30 08:39, Artem Smotrakov wrote: >>> cc'ing hotspot-dev at openjdk.java.net as David suggested. >>> >>> Artem >>> >>> >>> On 10/27/2017 11:02 PM, Artem Smotrakov wrote: >>>> Hello, >>>> >>>> Please review the following patch which adds support for >>>> AddressSanitizer. >>>> >>>> AddressSanitizer is a runtime memory error detector which looks for >>>> various memory corruption issues and leaks. >>>> >>>> Please refer to [1] for details. AddressSanitizer is available in >>>> gcc 4.8+ and clang 3.1+ >>>> >>>> The patch below introduces --enable-asan parameter for the >>>> configure script which enables AddressSanitizer. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8189800 >>>> Webrev: http://cr.openjdk.java.net/~asmotrak/8189800/webrev.00/ >> spec.gmk.in should only have export for variables that needs to be >> exported in the environment for executing binaries, that is >> ASAN_OPTIONS and LD_LIBRARY_PATH, not ASAN_ENABLED or DEVKIT_LIB_DIR. >> >> I'm also a bit curious about the addition of of DEVKIT_LIB_DIR. Would >> you care to elaborate your thinking? >> >> Otherwise it looks good. >> >> /Magnus >> >>>> >>>> [1] https://github.com/google/sanitizers/wiki/AddressSanitizer >>>> >>>> Artem >>> >> > From magnus.ihse.bursie at oracle.com Tue Oct 31 12:42:45 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Tue, 31 Oct 2017 13:42:45 +0100 Subject: RFR [10] 8189800: Add support for AddressSanitizer In-Reply-To: <8a97162b-4055-472c-dc55-5e38bd9a5ca8@oracle.com> References: <51eabbae-5435-59be-f443-a6b214a17513@oracle.com> <4982c49e-859c-75d4-e5b1-b4a68b49d746@oracle.com> <3b08231c-4f8b-3d9a-3515-15afe743b82d@oracle.com> <8a97162b-4055-472c-dc55-5e38bd9a5ca8@oracle.com> Message-ID: <65ec3f30-2d0a-fe84-5455-d7ced2235061@oracle.com> On 2017-10-31 13:24, David Holmes wrote: > Sounds reasonable. Anyone using older gcc simply won't/shouldn't > enable Asan. Agree. We will probably not be keeping any pretense of supporting anything older than gcc 4.9 at some point in time anyway. I believe the only known user of the oldest gcc is SAP for some of their platforms. 
/Magnus > > Thanks, > David > > On 31/10/2017 8:58 PM, Artem Smotrakov wrote: >> Hi David, >> >> That's a good question, I thought about it. According to [1]: >> >> - recommended versions of gcc is 4.9.2 >> - the minimum accepted version of gcc is 4.7 (Older versions will >> generate a warning by `configure` and are unlikely to work.) >> - the minimum accepted version of clang is 3.2 (Older versions will >> not be accepted by `configure`) >> >> It looks like that clang has to be at least 3.2 which should contain >> AddressSanitizer. Only for gcc, there may be a chance that someone >> wants to use 4.7. So, we might want to check version to see if it's >> 4.7, although I am not sure how many people would like to use gcc >> 4.7. As a result, this case didn't look very common to me, so I >> preferred to simplify the patch, and didn't add such a check. >> >> Without version check, compilation is going to fail if gcc 4.7 is >> used, and -fsanitize=address enabled. >> >> [1] >> http://hg.openjdk.java.net/jdk10/master/file/438e0c9f2f17/doc/building.md >> >> Artem >> >> On 10/31/2017 01:37 PM, David Holmes wrote: >>> Hi Artem, >>> >>> On 28/10/2017 6:02 AM, Artem Smotrakov wrote: >>>> Hello, >>>> >>>> Please review the following patch which adds support for >>>> AddressSanitizer. >>>> >>>> AddressSanitizer is a runtime memory error detector which looks for >>>> various memory corruption issues and leaks. >>>> >>>> Please refer to [1] for details. AddressSanitizer is available in >>>> gcc 4.8+ and clang 3.1+ >>> >>> Should we be checking the version before adding the flags? >>> >>> Thanks, >>> David >>> >>>> The patch below introduces --enable-asan parameter for the >>>> configure script which enables AddressSanitizer. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8189800 >>>> Webrev: http://cr.openjdk.java.net/~asmotrak/8189800/webrev.00/ >>>> >>>> [1] https://github.com/google/sanitizers/wiki/AddressSanitizer >>>> >>>> Artem >> From coleen.phillimore at oracle.com Tue Oct 31 12:53:17 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Oct 2017 08:53:17 -0400 Subject: RFR (L, tedious again, sorry) 8189610: Reconcile jvm.h and all jvm_md.h between java.base and hotspot In-Reply-To: <058662bb-5d5b-0085-cc08-02192d000838@oracle.com> References: <8671321f-398c-5f7f-634d-9d9664e04d87@oracle.com> <05bf853a-52f8-c450-4171-89f6b64d793a@oracle.com> <72f2aac7-ed20-c995-913b-ee4341a2a978@oracle.com> <55ec3559-c593-bcb6-51b0-4639da126068@oracle.com> <4509dce7-10f8-4558-2adb-90d4745e054e@oracle.com> <396ab0f7-3710-3f76-675a-5108bcb50af5@oracle.com> <22afedef-59cc-ecde-48fc-0afb7b4bbb47@oracle.com> <815ac734-ea8b-ea2d-ecec-85cb547ba2f4@oracle.com> <440f79ba-2da3-b627-53bc-e1842e3cf73c@oracle.com> <058662bb-5d5b-0085-cc08-02192d000838@oracle.com> Message-ID: On 10/30/17 8:21 PM, David Holmes wrote: > On 31/10/2017 12:48 AM, coleen.phillimore at oracle.com wrote: >> >> http://cr.openjdk.java.net/~coleenp/8189610.incr.02/webrev/index.html >> >> Changed JDK file to use PATH_MAX.? Retested jdk tier1 tests. > > Why PATH_MAX instead of MAXPATHLEN? They appear to be the same on > Linux and Solaris, but I don't know if that is true for AIX and Mac OS > / BSD. I picked PATH_MAX because canonicalize_md.c uses that constant. > > Does UnixFileSystem_md.c still need the jvm.h include now? No, I will remove it. 
Thanks, Coleen > > Thanks, > David > >> thanks, >> Coleen >> >> On 10/30/17 8:38 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 10/30/17 8:17 AM, David Holmes wrote: >>>> On 30/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >>>>> On 10/28/17 3:50 AM, David Holmes wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> I've commented on the file location in response to Mandy's email. >>>>>> >>>>>> The only issue I'm still concerned about is the JVM_MAXPATHLEN >>>>>> issue. I think it is a bug to define a JVM_MAXPATHLEN that is >>>>>> bigger than the platform MAXPATHLEN. I also would not want to see >>>>>> any change in behaviour because of this - so AIX and Solaris >>>>>> should not get a different JVM_MAXPATHLEN due to this refactoring >>>>>> change. So yes I think this needs to be ifdef'd for Linux and >>>>>> reluctantly (because it was a copy error) for OSX/BSD as well. >>>>> >>>>> #if defined(AIX) || defined(SOLARIS) >>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>> #else >>>>> // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on others. This >>>>> may >>>>> //?????? cause problems if JVM and the rest of JDK are built on >>>>> different >>>>> //?????? Linux releases. Here we define JVM_MAXPATHLEN to be >>>>> MAXPATHLEN + 1, >>>>> //?????? so buffers declared in VM are always >= 4096. >>>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>> #endif >>>>> >>>>> Is this ok? >>>> >>>> Yes - thanks. It preserves existing behaviour on the VM side at >>>> least. Time will tell if it messes anything up on the JDK side for >>>> Linux/OSX. >>> >>> I don't want to wait for time so I'm investigating. >>> >>> It's one use is: >>> >>> Java_java_io_UnixFileSystem_canonicalize0(JNIEnv *env, jobject this, >>> ... >>> ??????? char canonicalPath[JVM_MAXPATHLEN]; >>> ??????? if (canonicalize((char *)path, >>> ???????????????????????? canonicalPath, JVM_MAXPATHLEN) < 0) { >>> ??????????? JNU_ThrowIOExceptionWithLastError(env, "Bad pathname"); >>> >>> Which goes to: >>> >>> canonicalize_md.c >>> >>> canonicalize(char *original, char *resolved, int len) >>> ??? if (len < PATH_MAX) { >>> ??????? errno = EINVAL; >>> ??????? return -1; >>> ??? } >>> >>> >>> So this should fail every time. >>> >>> sys/param.h:# define MAXPATHLEN??? PATH_MAX >>> >>> I haven't found any tests for it. >>> >>> I don't know why Java_java_io_UnixFileSystem uses JVM_MAXPATHLEN >>> since it's not calling the JVM interface as far as I can tell. I >>> think it should be changed to PATH_MAX. >>> >>> ? >>> Coleen >>>> >>>> David >>>> >>>>> thanks, >>>>> Coleen >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 28/10/2017 12:08 AM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> >>>>>>> On 10/27/17 9:37 AM, David Holmes wrote: >>>>>>>> On 27/10/2017 10:13 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 10/27/17 3:23 AM, David Holmes wrote: >>>>>>>>>> Hi Coleen, >>>>>>>>>> >>>>>>>>>> Thanks for tackling this. >>>>>>>>>> >>>>>>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>>>>> >>>>>>>>>> Can you update the bug synopsis to show it covers both sets >>>>>>>>>> of files please. >>>>>>>>>> >>>>>>>>>> I hate to start with this (and it took me quite a while to >>>>>>>>>> realize it) but as Mandy pointed out jvm.h is not an exported >>>>>>>>>> interface from the JDK to the outside world (so not subject >>>>>>>>>> to CSR review), but is a private interface between the JVM >>>>>>>>>> and the JDK libraries. 
So I think really jvm.h belongs in the >>>>>>>>>> hotspot sources where it was, while jni.h belongs in the >>>>>>>>>> exported JDK sources. In which case the bulk of your changes >>>>>>>>>> to the hotspot files would not be needed - sorry. >>>>>>>>> >>>>>>>>> Maybe someone can make that decision and change at a later >>>>>>>>> date. The point of this change is that there is now only one >>>>>>>>> of these files that is shared.? I don't think jvm.h and the >>>>>>>>> jvm_md.h belong on the hotspot sources for the jdk to find >>>>>>>>> them in some random prims and os dependent directories. >>>>>>>> >>>>>>>> The one file that is needed is a hotspot file - jvm.h defines >>>>>>>> the interface that hotspot exports via jvm.cpp. >>>>>>>> >>>>>>>> If you leave jvm.h in hotspot/prims then a very large chunk of >>>>>>>> your boilerplate changes are not needed. The JDK code doesn't >>>>>>>> care what the name of the directory is - whatever it is just >>>>>>>> gets added as a -I directive (the JDK code will include "jvm.h" >>>>>>>> not "prims/jvm.h" the way hotspot sources do. >>>>>>>> >>>>>>>> This isn't something we want to change back or move again >>>>>>>> later. Whatever we do now we live with. >>>>>>> >>>>>>> I think it belongs with jni.h and I think the core libraries >>>>>>> group would agree.?? It seems more natural there than buried in >>>>>>> the hotspot prims directory.? I guess this is on hold while we >>>>>>> have this debate. Sigh. >>>>>>> >>>>>>> Actually with -I directives, changing to jvm.h from prims/jvm.h >>>>>>> would still work.?? Maybe we should change the name to jvm.hpp >>>>>>> since it's jvm.cpp though??? Or maybe just have two divergent >>>>>>> copies and close this as WNF. >>>>>>> >>>>>>>> >>>>>>>>> I'm happy to withdraw the CSR. We generally use the CSR >>>>>>>>> process to add and remove JVM_ interfaces even though they're >>>>>>>>> a private interface in case some other JVM/JDK combination >>>>>>>>> relies on them. The changes to these files are very minor >>>>>>>>> though and not likely to cause any even theoretical >>>>>>>>> incompatibility, so I'll withdraw it. >>>>>>>>>> >>>>>>>>>> Moving on ... >>>>>>>>>> >>>>>>>>>> First to address the initial comments/query you had: >>>>>>>>>> >>>>>>>>>>> The JDK windows jni_md.h file defined jint as long and the >>>>>>>>>>> hotspot >>>>>>>>>>> windows jni_x86.h as int. I had to choose the jdk version >>>>>>>>>>> since it's the >>>>>>>>>>> public version, so there are changes to the hotspot files >>>>>>>>>>> for this. >>>>>>>>>> >>>>>>>>>> On Windows int and long are always the same as it uses ILP32 >>>>>>>>>> or LLP64 (not LP64 like *nix platforms). So either choice >>>>>>>>>> should be fine. That said there are some odd casting issues I >>>>>>>>>> comment on below. Does the VS compiler complain about mixing >>>>>>>>>> int and long in expressions? >>>>>>>>> >>>>>>>>> Yes, it does even though int and long are the same >>>>>>>>> representation. >>>>>>>> >>>>>>>> And what an absolute mess that makes. :( >>>>>>>> >>>>>>>>>> >>>>>>>>>>> Generally I changed the code to use 'int' rather than 'jint' >>>>>>>>>>> where the >>>>>>>>>>> surrounding API didn't insist on consistently using java >>>>>>>>>>> types. We >>>>>>>>>>> should mostly be using C++ types within hotspot except in >>>>>>>>>>> interfaces to >>>>>>>>>>> native/JNI code. >>>>>>>>>> >>>>>>>>>> I think you pulled too hard on a few threads here and things >>>>>>>>>> are starting to unravel. 
There are numerous cases I refer to >>>>>>>>>> below where either the cast seems unnecessary/inappropriate >>>>>>>>>> or else highlights a bunch of additional changes that also >>>>>>>>>> need to be made. The fan out from this could be horrendous. >>>>>>>>>> Unless you actually get some kind of error - and I'd like to >>>>>>>>>> understand the details of those - I would not suggest making >>>>>>>>>> these changes as part of this work. >>>>>>>>> >>>>>>>>> I didn't make any change unless there was was an error. I have >>>>>>>>> 100 failed JPRT jobs to confirm!? I eventually got a Windows >>>>>>>>> system to compile and test this on. Actually some of the >>>>>>>>> changes came out better.? Cases where we use jint as a bool >>>>>>>>> simply turned to int. We do not have an overload for bool for >>>>>>>>> cmpxchg. >>>>>>>> >>>>>>>> That's unfortunate - ditto for OrderAccess. >>>>>>>> >>>>>>>>>> >>>>>>>>>> Looking through I have a quite a few queries/comments - >>>>>>>>>> apologies in advance as I know how tedious this is: >>>>>>>>>> >>>>>>>>>> make/hotspot/lib/CompileLibjsig.gmk >>>>>>>>>> src/java.base/solaris/native/libjsig/jsig.c >>>>>>>>>> >>>>>>>>>> Took a while to figure out why the include was needed. :) As >>>>>>>>>> a follow up I suggest just deleting the -I include directive, >>>>>>>>>> delete the Solaris-only definition of JSIG_VERSION_1_4_1, and >>>>>>>>>> delete everything to do with JVM_get_libjsig_version. It is >>>>>>>>>> all obsolete. >>>>>>>>> >>>>>>>>> Can I patch up jsig in a separate RFE?? I don't remember why >>>>>>>>> this broke so I simply moved JSIG #define.? Is jsig obsolete? >>>>>>>>> Removing JVM_* definitions generally requires a CSR. >>>>>>>> >>>>>>>> I did say "As a follow up". jsig is not obsolete but the jsig >>>>>>>> versioning code, only used by Solaris, is. >>>>>>>> >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/cpu/arm/interp_masm_arm.cpp >>>>>>>>>> >>>>>>>>>> Why did you need to add the jvm.h include? >>>>>>>>>> >>>>>>>>> >>>>>>>>> ?? tbz(Raccess_flags, JVM_ACC_SYNCHRONIZED_BIT, unlocked); >>>>>>>> >>>>>>>> Okay. I'm not going to try and figure out how this code found >>>>>>>> this before. >>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/os/windows/os_windows.cpp. >>>>>>>>>> >>>>>>>>>> The type of process_exiting should be uint to match the DWORD >>>>>>>>>> of GetCurrentThreadID. Then you should need any casts. Also >>>>>>>>>> you missed this jint cast: >>>>>>>>>> >>>>>>>>>> 3796???????? process_exiting != (jint)GetCurrentThreadId()) { >>>>>>>>> >>>>>>>>> Yes, that's better to change process_exiting to a DWORD.? It >>>>>>>>> needs a DWORD cast to 0 in the cmpxchg. >>>>>>>>> >>>>>>>>> ???????? Atomic::cmpxchg(GetCurrentThreadId(), >>>>>>>>> &process_exiting, (DWORD)0); >>>>>>>>> >>>>>>>>> These templates are picky. >>>>>>>> >>>>>>>> Yes - their inability to deal with literals is extremely >>>>>>>> frustrating. >>>>>>>> >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/c1/c1_Canonicalizer.hpp >>>>>>>>>> >>>>>>>>>> ? 43 #ifdef _WINDOWS >>>>>>>>>> ? 44?? // jint is defined as long in jni_md.h, so convert >>>>>>>>>> from int to jint >>>>>>>>>> ? 45?? void set_constant(int x) { set_constant((jint)x); } >>>>>>>>>> ? 46 #endif >>>>>>>>>> >>>>>>>>>> Why is this necessary? int and long are the same on Windows. >>>>>>>>>> The whole point is that jint hides the underlying type, so >>>>>>>>>> where does this go wrong? >>>>>>>>> >>>>>>>>> No, they are not the same types even though they have the same >>>>>>>>> representation! 
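To make the "same representation, different type" point concrete, here is a stand-alone sketch (not HotSpot code): cmpxchg below is a simplified placeholder showing the same deduction failure as the Atomic::cmpxchg call quoted above, and DWORD_t stands in for the Win32 DWORD typedef.

    #include <cstdio>

    typedef long jint;              // the Windows jni_md.h choice; same 32-bit size as int under LLP64
    typedef unsigned long DWORD_t;  // stand-in for the Win32 DWORD typedef

    // Simplified placeholder: all three value parameters must deduce to one T.
    template <typename T>
    T cmpxchg(T exchange_value, volatile T* dest, T compare_value) {
      T old = *dest;
      if (old == compare_value) *dest = exchange_value;
      return old;
    }

    struct Canonicalizer {
      void set_constant(jint x) { std::printf("set_constant(jint = %ld)\n", (long)x); }
      // Forwarding overload in the spirit of c1_Canonicalizer.hpp line 45:
      // plain int arguments reach the jint version without casts at every call site.
      void set_constant(int x)  { set_constant((jint)x); }
    };

    int main() {
      volatile DWORD_t process_exiting = 0;
      DWORD_t tid = 42;
      // cmpxchg(tid, &process_exiting, 0);        // ill-formed: T deduced as DWORD_t and as int
      cmpxchg(tid, &process_exiting, (DWORD_t)0);  // the cast reconciles the deduction

      Canonicalizer c;
      c.set_constant(1);   // int literal, handled by the forwarding overload
      return 0;
    }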
>>>>>>>> >>>>>>>> This is truly unfortunate. >>>>>>>> >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/c1/c1_LinearScan.cpp >>>>>>>>>> >>>>>>>>>> ?ConstantIntValue((jint)0); >>>>>>>>>> >>>>>>>>>> why is this cast needed? what causes the ambiguity? (If this >>>>>>>>>> was a template I'd understand ;-) ). Also didn't you change >>>>>>>>>> that constructor to take an int anyway - not that I think it >>>>>>>>>> should - see below. >>>>>>>>> >>>>>>>>> Yes, it caused an ambiguity.? 0 matches 'int' but it doesn't >>>>>>>>> match 'long' better than any pointer type.? So this cast is >>>>>>>>> needed. >>>>>>>> >>>>>>>> But you changed the constructor to take an int! >>>>>>>> >>>>>>>> ?class ConstantIntValue: public ScopeValue { >>>>>>>> ? private: >>>>>>>> -? jint _value; >>>>>>>> +? int _value; >>>>>>>> ? public: >>>>>>>> -? ConstantIntValue(jint value)???????? { _value = value; } >>>>>>>> +? ConstantIntValue(int value)????????? { _value = value; } >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> Okay I removed this cast. >>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/ci/ciReplay.cpp >>>>>>>>>> >>>>>>>>>> 793???????? jint* dims = NEW_RESOURCE_ARRAY(jint, rank); >>>>>>>>>> >>>>>>>>>> why should this be jint? >>>>>>>>> >>>>>>>>> To avoid a cast from int* to jint* in the line below: >>>>>>>>> >>>>>>>>> ????????? value = kelem->multi_allocate(rank, dims, CHECK); >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/classfile/altHashing.cpp >>>>>>>>>> >>>>>>>>>> Okay this looks more consistent with jint. >>>>>>>>> >>>>>>>>> Yes.? I translated this from some native code iirc. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/code/debugInfo.hpp >>>>>>>>>> >>>>>>>>>> These changes seem wrong. We have: >>>>>>>>>> >>>>>>>>>> ConstantLongValue(jlong value) >>>>>>>>>> ConstantDoubleValue(jdouble value) >>>>>>>>>> >>>>>>>>>> so we should have: >>>>>>>>>> >>>>>>>>>> ConstantIntValue(jint value) >>>>>>>>> >>>>>>>>> Again, there are multiple call sites with '0', which match int >>>>>>>>> trivially but are confused with long.? It's less consistent I >>>>>>>>> agree but better to not cast all the call sites. >>>>>>>> >>>>>>>> This is really making a mess of the APIs - they should be a >>>>>>>> jint but we declare them int because of a 0 casting problem. >>>>>>>> Can't we just use 0L? >>>>>>> >>>>>>> There aren't that many casts.? You're right, that would have >>>>>>> been better in some places. >>>>>>> >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/code/relocInfo.cpp >>>>>>>>>> >>>>>>>>>> Change seems unnecessary - int32_t is fine >>>>>>>>>> >>>>>>>>> >>>>>>>>> No, int32_t doesn't match the calls below it. They all assume >>>>>>>>> _lo and _hi are jint. >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/compiler/compileBroker.cpp >>>>>>>>>> src/hotspot/share/compiler/compileBroker.hpp >>>>>>>>>> >>>>>>>>>> I see a complete mix of int and jint in this class, so why >>>>>>>>>> make the one change you did ?? >>>>>>>>> >>>>>>>>> This is another case of using jint as a flag with cmpxchg. The >>>>>>>>> templates for cmpxchg want the types to match and 0 and 1 are >>>>>>>>> essentially 'int'.? This is a lot cleaner this way. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >>>>>>>>>> >>>>>>>>>> 1700???? tty->write((char*) start, MIN2(length, >>>>>>>>>> (jint)O_BUFLEN)); >>>>>>>>>> >>>>>>>>>> why did you need to add the jint cast? 
It's used without any >>>>>>>>>> cast on the next two lines: >>>>>>>>>> >>>>>>>>>> 1701???? length -= O_BUFLEN; >>>>>>>>>> 1702???? offset += O_BUFLEN; >>>>>>>>>> >>>>>>>>> >>>>>>>>> There's a conversion from O_BUFLEN from int to long in 1701 >>>>>>>>> and 1702.?? MIN2 is a template that wants the types to match >>>>>>>>> exactly. >>>>>>>> >>>>>>>> $%^%$! templates! >>>>>>>> >>>>>>>>>> ?? >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>>>>>>>>> >>>>>>>>>> Looking around this code it seems very confused about types - >>>>>>>>>> eg the previous function is declared jboolean yet returns a >>>>>>>>>> jint on one path! It isn't clear to me if the return type is >>>>>>>>>> what should be changed or the parameter type? I would just >>>>>>>>>> leave this alone. >>>>>>>>> >>>>>>>>> I can't leave it alone because it doesn't compile that way. >>>>>>>>> This was the minimal change and yea, does look a bit >>>>>>>>> inconsistent. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/opto/mulnode.cpp >>>>>>>>>> >>>>>>>>>> Okay TypeInt has jint parts, so the remaining int32_t >>>>>>>>>> declarations (A, B, C, D) should also be jint. >>>>>>>>> >>>>>>>>> Yes.? c2 uses jint types. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/opto/parse3.cpp >>>>>>>>>> >>>>>>>>>> I agree with the changes you made, but then: >>>>>>>>>> >>>>>>>>>> ?419???? jint dim_con = find_int_con(length[j], -1); >>>>>>>>>> >>>>>>>>>> should also be changed. >>>>>>>>>> >>>>>>>>>> And obviously MultiArrayExpandLimit should be defined as int >>>>>>>>>> not intx! >>>>>>>>> >>>>>>>>> Everything in globals.hpp is intx.? That's a thread that I >>>>>>>>> don't want to pull on! >>>>>>>> >>>>>>>> We still have that limitation? >>>>>>>>> >>>>>>>>> Changed dim_con to int. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/opto/phaseX.cpp >>>>>>>>>> >>>>>>>>>> I can see that intcon(jint i) is consistent with >>>>>>>>>> longcon(jlong l), but the use of "i" in the code is more >>>>>>>>>> consistent with int than jint. >>>>>>>>> >>>>>>>>> huh?? really? >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/opto/type.cpp >>>>>>>>>> >>>>>>>>>> 1505 int TypeInt::hash(void) const { >>>>>>>>>> 1506?? return java_add(java_add(_lo, _hi), >>>>>>>>>> java_add((jint)_widen, (jint)Type::Int)); >>>>>>>>>> 1507 } >>>>>>>>>> >>>>>>>>>> I can see that the (jint) casts you added make sense, but >>>>>>>>>> then the whole function should be returning jint not int. >>>>>>>>>> Ditto the other hash functions. >>>>>>>>> >>>>>>>>> I'm not messing with this, this is the minimal in type fixing >>>>>>>>> that I'm going to do here. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/prims/jni.cpp >>>>>>>>>> >>>>>>>>>> I think vm_created should be a bool. In fact all the fields >>>>>>>>>> you changed are logically bools - do Atomics work for bool now? >>>>>>>>> >>>>>>>>> No, they do not.?? I had thought bool would be better >>>>>>>>> originally too. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/prims/jvm.cpp >>>>>>>>>> >>>>>>>>>> is_attachable is the terminology used in the JDK code. >>>>>>>>> >>>>>>>>> Well the JDK version had is_attach_supported() as the flag >>>>>>>>> name so I used that in this one place. 
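The ambiguity described above for ConstantIntValue((jint)0) can be reproduced in isolation; make_value and ScopeValueDemo are stand-ins, not the debugInfo.hpp types:

    typedef long jint;                      // as in the Windows jni_md.h

    struct ScopeValueDemo {};

    void make_value(jint)            {}     // integer-taking overload
    void make_value(ScopeValueDemo*) {}     // pointer-taking overload

    int main() {
      // make_value(0);      // ambiguous: int -> long (integral conversion) and
      //                     // 0 -> ScopeValueDemo* (null pointer conversion) rank equally
      make_value((jint)0);   // exact match for the integer overload
      make_value(0L);        // the 0L spelling suggested in the thread also resolves it
      return 0;
    }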
>>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/prims/jvmtiEnvBase.cpp >>>>>>>>>> src/hotspot/share/prims/jvmtiImpl.cpp >>>>>>>>>> >>>>>>>>>> Are you making parameters consistent with the fields they >>>>>>>>>> initialize? >>>>>>>>> >>>>>>>>> They're consistent with the declarations now. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/prims/jvmtiTagMap.cpp >>>>>>>>>> >>>>>>>>>> There is a mix of int and jint for slot in this code. You >>>>>>>>>> fixed some, but this remains: >>>>>>>>>> >>>>>>>>>> 2440 inline bool CallbackInvoker::report_stack_ref_root(jlong >>>>>>>>>> thread_tag, >>>>>>>>>> 2441 jlong tid, >>>>>>>>>> 2442 jint depth, >>>>>>>>>> 2443 jmethodID method, >>>>>>>>>> 2444 jlocation bci, >>>>>>>>>> 2445 jint slot, >>>>>>>>> >>>>>>>>> Right for consistency with the declarations. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/runtime/perfData.cpp >>>>>>>>>> >>>>>>>>>> Callers pass both jint and int, so param type seems arbitrary. >>>>>>>>> >>>>>>>>> They are, but importantly they match the declarations. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/hotspot/share/runtime/perfMemory.cpp >>>>>>>>>> src/hotspot/share/runtime/perfMemory.hpp >>>>>>>>>> >>>>>>>>>> PerfMemory::_initialized should ideally be a bool - can >>>>>>>>>> OrderAccess handle that now? >>>>>>>>> >>>>>>>>> Nope. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/java.base/share/native/include/jvm.h >>>>>>>>>> >>>>>>>>>> Not clear why the jio functions are not also JNICALL ? >>>>>>>>> >>>>>>>>> They are now.? The JDK version didn't have JNICALL. JVM needs >>>>>>>>> JNICALL.? I can't tell you why JDK didn't need JNICALL linkage. >>>>>>>> >>>>>>>> ?? JVM currently does not have JNICALL. But they are declared >>>>>>>> as "extern C". >>>>>>> >>>>>>> This was a compilation error on Windows with JDK. Maybe the C >>>>>>> code in the JDK doesn't complain about linkage differences. I'll >>>>>>> have to go back and figure this out then. >>>>>>>> >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/java.base/unix/native/include/jni_md.h >>>>>>>>>> >>>>>>>>>> There is no need to special case ARM. The differences in the >>>>>>>>>> existing code were for LTO support and that is now irrelevant. >>>>>>>>> >>>>>>>>> See discussion with Magnus.?? We still build ARM for jdk10/hs >>>>>>>>> so I needed this conditional or of course I wouldn't have >>>>>>>>> added it.? We can remove it with LTO support. >>>>>>>> >>>>>>>> Those builds are gone - this is obsolete. But yes all LTO can >>>>>>>> be removed later if you wish. Just trying to simplify things now. >>>>>>>> >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/java.base/unix/native/include/jvm_md.h >>>>>>>>>> >>>>>>>>>> I know you've just copied this across, but it seems wrong to me: >>>>>>>>>> >>>>>>>>>> ?57 // Hack: MAXPATHLEN is 4095 on some Linux and 4096 on >>>>>>>>>> others. This may >>>>>>>>>> ? 58 //?????? cause problems if JVM and the rest of JDK are >>>>>>>>>> built on different >>>>>>>>>> ? 59 //?????? Linux releases. Here we define JVM_MAXPATHLEN >>>>>>>>>> to be MAXPATHLEN + 1, >>>>>>>>>> ? 60 //?????? so buffers declared in VM are always >= 4096. >>>>>>>>>> ? 61 #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>>>>>>> >>>>>>>>>> It doesn't make sense to me to define an internal "max path >>>>>>>>>> length" that can _exceed_ the platform max! >>>>>>>>>> >>>>>>>>>> That aside there's no support for building different parts of >>>>>>>>>> the JDK on different platforms and then bringing them >>>>>>>>>> together. 
And in any case I would think the real problem >>>>>>>>>> would be building on a platform that uses 4096 and running on >>>>>>>>>> one that uses 4095! >>>>>>>>>> >>>>>>>>>> But that aside this is a Linux hack and should be guarded by >>>>>>>>>> ifdef LINUX. (I doubt BSD needs it, the bsd file is just a >>>>>>>>>> copy of the linux one - the JDK macosx version does the right >>>>>>>>>> thing). Solaris and AIX should stay as-is at MAXPATHLEN. >>>>>>>>> >>>>>>>>> All of the unix platforms had MAXPATHLEN+1.? I'll leave it for >>>>>>>>> now and we can investigate that further. >>>>>>>> >>>>>>>> I see the following existing code: >>>>>>>> >>>>>>>> src/java.base/unix/native/include/jvm_md.h: >>>>>>>> >>>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>>>> >>>>>>>> src/java.base/macosx/native/include/jvm_md.h >>>>>>>> >>>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>>>> >>>>>>>> src/hotspot/os/aix/jvm_aix.h >>>>>>>> >>>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>>>> >>>>>>>> src/hotspot/os/bsd/jvm_bsd.h >>>>>>>> >>>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1? // blindly copied from >>>>>>>> Linux version >>>>>>>> >>>>>>>> src/hotspot/os/linux/jvm_linux.h >>>>>>>> >>>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN + 1 >>>>>>>> >>>>>>>> src/hotspot/os/solaris/jvm_solaris.h >>>>>>>> >>>>>>>> #define JVM_MAXPATHLEN MAXPATHLEN >>>>>>>> >>>>>>>> This is a linux only hack (if you ignore the blind copy from >>>>>>>> linux into the BSD code in the VM). >>>>>>> >>>>>>> Oh, thanks, so should I add a bunch of ifdefs then? Or do you >>>>>>> think having MAXPATHLEN + 1 will really break the other >>>>>>> platforms?? Do you really see this as a problem or are you just >>>>>>> pointing out inconsistency? >>>>>>>> >>>>>>>>>> >>>>>>>>>> ?86 #define ASYNC_SIGNAL???? SIGJVM2 >>>>>>>>>> >>>>>>>>>> This only exists on Solaris so I think should be in #ifdef >>>>>>>>>> SOLARIS, to make that clear. >>>>>>>>> >>>>>>>>> Ok.? I'll add this. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> src/java.base/windows/native/include/jvm_md.h >>>>>>>>>> >>>>>>>>>> Given the differences between the two versions either >>>>>>>>>> something has been broken or "extern C" declarations are not >>>>>>>>>> needed :) >>>>>>>>> >>>>>>>>> Well, they are needed for Hotspot to build and do not prevent >>>>>>>>> jdk from building.? I don't know what was broken. >>>>>>>> >>>>>>>> We really need to understand this better. Maybe related to the >>>>>>>> map files that expose the symbols. ?? >>>>>>> >>>>>>> They're needed because the JDK files are written mostly in C and >>>>>>> that doesn't complain about the linkage difference. Hotspot >>>>>>> files are in C++ which does complain. >>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> That was a really painful way to spend most of my Friday. >>>>>>>>>> TGIF! :) >>>>>>>>> >>>>>>>>> Thanks for going through it.? See comments inline for changes. >>>>>>>>> Generating a webrev takes hours so I'm not going to do that >>>>>>>>> unless you insist. >>>>>>>> >>>>>>>> An incremental webrev shouldn't take long - right? You're a mq >>>>>>>> maestro now. :) >>>>>>> >>>>>>> Well I generally trash a repository whenever I use mq but sure. >>>>>>>> >>>>>>>> If you can reasonably produce an incremental webrev once you've >>>>>>>> settled on all the comments/issues that would be good. >>>>>>> >>>>>>> Ok, sure. 
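A compact sketch of that linkage point; DEMO_CALL and demo_exported_entry are illustrative and are not the real jvm.h declarations:

    // In C++ the language linkage (and, with MSVC, the calling convention) is part
    // of a function's type, so shared declarations are wrapped in extern "C" and
    // must agree with their definitions; a plain C compiler is far more forgiving,
    // which is why the JDK's C sources never complained.
    #ifdef __cplusplus
    extern "C" {
    #endif

    #ifdef _WIN32
    #define DEMO_CALL __stdcall   // as JNICALL does on Windows (ignored on x64)
    #else
    #define DEMO_CALL
    #endif

    int DEMO_CALL demo_exported_entry(int value);

    #ifdef __cplusplus
    }
    #endif

    // Definition, typically in a C++ file; linkage and convention must match.
    extern "C" int DEMO_CALL demo_exported_entry(int value) {
      return value + 1;
    }

    int main() {
      return demo_exported_entry(41) == 42 ? 0 : 1;
    }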
>>>>>>> >>>>>>> Coleen >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> ----- >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 27/10/2017 6:44 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>> ??Hi Magnus, >>>>>>>>>>> >>>>>>>>>>> Thank you for reviewing this.?? I have a new version that >>>>>>>>>>> takes out the hack in globalDefinitions.hpp and adds casts >>>>>>>>>>> to src/hotspot/share/opto/type.cpp instead. >>>>>>>>>>> >>>>>>>>>>> Also some fixes from Martin at SAP. >>>>>>>>>>> >>>>>>>>>>> open webrev at >>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8189610.02/webrev >>>>>>>>>>> >>>>>>>>>>> see below. >>>>>>>>>>> >>>>>>>>>>> On 10/26/17 5:57 AM, Magnus Ihse Bursie wrote: >>>>>>>>>>>> Coleen, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for addressing this! >>>>>>>>>>>> >>>>>>>>>>>> On 2017-10-25 18:49, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>>> Summary: removed hotspot version of jvm*h and jni*h files >>>>>>>>>>>>> >>>>>>>>>>>>> Mostly used sed to remove prims/jvm.h and move #include >>>>>>>>>>>>> "jvm.h" after precompiled.h, so if you have repetitive >>>>>>>>>>>>> stress wrist issues don't click on most of these files. >>>>>>>>>>>>> >>>>>>>>>>>>> There were more issues to resolve, however. The JDK >>>>>>>>>>>>> windows jni_md.h file defined jint as long and the hotspot >>>>>>>>>>>>> windows jni_x86.h as int. I had to choose the jdk version >>>>>>>>>>>>> since it's the public version, so there are changes to the >>>>>>>>>>>>> hotspot files for this. Generally I changed the code to >>>>>>>>>>>>> use 'int' rather than 'jint' where the surrounding API >>>>>>>>>>>>> didn't insist on consistently using java types. We should >>>>>>>>>>>>> mostly be using C++ types within hotspot except in >>>>>>>>>>>>> interfaces to native/JNI code. There are a couple of hacks >>>>>>>>>>>>> in places where adding multiple jint casts was too painful. >>>>>>>>>>>>> >>>>>>>>>>>>> Tested with JPRT and tier2-4 (in progress). >>>>>>>>>>>>> >>>>>>>>>>>>> open webrev at >>>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8189610.01/webrev >>>>>>>>>>>> >>>>>>>>>>>> Looks great! >>>>>>>>>>>> >>>>>>>>>>>> Just a few comments: >>>>>>>>>>>> >>>>>>>>>>>> * src/java.base/unix/native/include/jni_md.h: >>>>>>>>>>>> >>>>>>>>>>>> I don't think the externally_visible attribute should be >>>>>>>>>>>> there for arm. I know this was the case for the >>>>>>>>>>>> corresponding hotspot file for arm, but that was techically >>>>>>>>>>>> incorrect. The proper dependency here is that >>>>>>>>>>>> externally_visible should be in all JNIEXPORT if and only >>>>>>>>>>>> if we're building with JVM feature "link-time-opt". >>>>>>>>>>>> Traditionally, that feature been enabled when building >>>>>>>>>>>> arm32 builds, and only then, so there's been a >>>>>>>>>>>> (coincidentally) connection here. Nowadays, Oracle does not >>>>>>>>>>>> care about the arm32 builds, and I'm not sure if anyone >>>>>>>>>>>> else is building them with link-time-opt enabled. >>>>>>>>>>>> >>>>>>>>>>>> It does seem wrong to me to export this behavior in the >>>>>>>>>>>> public jni_md.h file, though. I think the correct way to >>>>>>>>>>>> solve this, if we should continue supporting link-time-opt >>>>>>>>>>>> is to make sure this attribute is set for exported hotspot >>>>>>>>>>>> functions. If it's still needed, that is. A quick googling >>>>>>>>>>>> seems to indicate that visibility("default") might be >>>>>>>>>>>> enough in modern gcc's. 
>>>>>>>>>>>> >>>>>>>>>>>> A third option is to remove the support for link-time-opt >>>>>>>>>>>> entirely, if it's not really used. >>>>>>>>>>> >>>>>>>>>>> I didn't know how to change this since we are still building >>>>>>>>>>> ARM with the jdk10/hs repository, and ARM needed this >>>>>>>>>>> change. I could wait until we bring down the jdk10/master >>>>>>>>>>> changes that remove the ARM build and remove this >>>>>>>>>>> conditional before I push. Or we could file an RFE to remove >>>>>>>>>>> link-time-opt (?) and remove it then? >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> * src/java.base/unix/native/include/jvm_md.h and >>>>>>>>>>>> src/java.base/windows/native/include/jvm_md.h: >>>>>>>>>>>> >>>>>>>>>>>> These files define a public API, and contain non-trivial >>>>>>>>>>>> changes. I suspect you should file a CSR request. (Even >>>>>>>>>>>> though I realize you're only matching the header file with >>>>>>>>>>>> the reality.) >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I filed the CSR.?? Waiting for the next steps. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>>>> /Magnus >>>>>>>>>>>> >>>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8189610 >>>>>>>>>>>>> >>>>>>>>>>>>> I have a script to update copyright files on commit. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks to Magnus and ErikJ for the makefile changes. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Coleen >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>> >> From dmitry.samersoff at bell-sw.com Tue Oct 31 12:58:37 2017 From: dmitry.samersoff at bell-sw.com (Dmitry Samersoff) Date: Tue, 31 Oct 2017 15:58:37 +0300 Subject: [10] RFR 8186046 Minimal ConstantDynamic support In-Reply-To: References: <93431280-9CBF-4722-961D-F2D2D0F83B4E@oracle.com> Message-ID: Paul and Frederic, Thank you. One more question. Do we need to call verify_oop below? 509 { // Check for the null sentinel. ... 517 xorptr(result, result); // NULL object reference ... 521 if (VerifyOops) { 522 verify_oop(result); 523 } -Dmitry On 31.10.2017 00:56, Frederic Parain wrote: > I?m seeing no issue with rcx being aliased in this code. > > Fred > >> On Oct 30, 2017, at 15:44, Paul Sandoz wrote: >> >> Hi, >> >> Thanks for reviewing. >> >>> On 30 Oct 2017, at 11:05, Dmitry Samersoff wrote: >>> >>> Paul, >>> >>> templateTable_x86.cpp: >>> >>> 564 const Register flags = rcx; >>> 565 const Register rarg = NOT_LP64(rcx) LP64_ONLY(c_rarg1); >>> >>> Should we use another register for rarg under NOT_LP64 ? >>> >> >> I think it should be ok, it i ain?t an expert here on the interpreter and the calling conventions, so please correct me. >> >> Some more context: >> >> + const Register flags = rcx; >> + const Register rarg = NOT_LP64(rcx) LP64_ONLY(c_rarg1); >> + __ movl(rarg, (int)bytecode()); >> >> The current bytecode code is loaded into ?rarg? >> >> + call_VM(obj, CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_ldc), rarg); >> >> Then ?rarg" is the argument to the call to InterpreterRuntime::resolve_ldc, after which it is no longer referred to. >> >> +#ifndef _LP64 >> + // borrow rdi from locals >> + __ get_thread(rdi); >> + __ get_vm_result_2(flags, rdi); >> + __ restore_locals(); >> +#else >> + __ get_vm_result_2(flags, r15_thread); >> +#endif >> >> The result from the call is then loaded into flags. >> >> So i don?t think it matters in this case if rcx is aliased. >> >> Paul. 
>> >>> -Dmitry >>> >>> >>> On 10/26/2017 08:03 PM, Paul Sandoz wrote: >>>> Hi, >>>> >>>> Please review the following patch for minimal dynamic constant support: >>>> >>>> http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8186046-minimal-condy-support-hs/webrev/ >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8186046 >>>> https://bugs.openjdk.java.net/browse/JDK-8186209 >>>> >>>> This patch is based on the JDK 10 unified HotSpot repository. Testing so far looks good. >>>> >>>> By minimal i mean just the support in the runtime for a dynamic constant pool entry to be referenced by a LDC instruction or a bootstrap method argument. Much of the work leverages the foundations built by invoke dynamic but is arguably simpler since resolution is less complex. >>>> >>>> A small set of bootstrap methods will be proposed as a follow on issue for 10 (these are currently being refined in the amber repository). >>>> >>>> Bootstrap method invocation has not changed (and the rules are the same for dynamic constants and indy). It is planned to enhance this in a further major release to support lazy resolution of bootstrap method arguments. >>>> >>>> The CSR for the VM specification is here: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8189199 >>>> >>>> the j.l.invoke package documentation was also updated but please consider the VM specification as the definitive "source of truth" (we may clean up this area further later on so it becomes more informative, and that may also apply to duplicative text on MethodHandles/VarHandles). >>>> >>>> Any AoT-related work will be deferred to a future release. >>>> >>>> ? >>>> >>>> This patch only supports x64 platforms. There is a small set of changes specific to x64 (specifically to support null and primitives constants, as prior to this patch null was used as a sentinel for resolution and certain primitives types would never have been encountered, such as say byte). >>>> >>>> We will need to follow up with the SPARC platform and it is hoped/anticipated that OpenJDK members responsible for other platforms (namely ARM and PPC) will separately provide patches. >>>> >>>> ? >>>> >>>> Many of tests rely on an experimental byte code API that supports the generation of byte code with dynamic constants. >>>> >>>> One test uses class file bytes produced from a modified version of asmtools. The modifications have now been pushed but a new version of asmtools need to be rolled into jtreg before the test can operate directly on asmtools information rather than embedding class file bytes directly in the test. >>>> >>>> ? >>>> >>>> Paul. >>>> >>> >> > From doug.simon at oracle.com Tue Oct 31 13:05:11 2017 From: doug.simon at oracle.com (Doug Simon) Date: Tue, 31 Oct 2017 14:05:11 +0100 Subject: RFR: 8190415: [JVMCI] JVMCIRuntime::adjust_comp_level must not swallow ThreadDeath Message-ID: Please review this change that fixes a JVMCI code path that was swallowing ThreadDeath exceptions and thus preventing Thread.stop from working as intended. The webrev also contains some minor unrelated cleanup to mx_jvmci.py needed for supporting the consolidated repo. The internal test that caught this problem is now passing. 
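For context, a self-contained sketch of the general pattern at issue; FakeThread and the string-based exception check are illustrative, and this is not the JVMCIRuntime::adjust_comp_level code:

    #include <cstdio>
    #include <string>

    struct FakeThread {
      std::string pending_exception;               // empty means nothing pending
      bool has_pending_exception() const { return !pending_exception.empty(); }
      void clear_pending_exception()     { pending_exception.clear(); }
    };

    // After an upcall into Java, decide what to do with a pending exception.
    // The point of the fix: ThreadDeath must stay pending so that Thread.stop()
    // still terminates the thread further up the stack.
    void after_upcall(FakeThread& t) {
      if (!t.has_pending_exception()) return;
      if (t.pending_exception == "java.lang.ThreadDeath") {
        return;                                    // do not swallow it
      }
      std::printf("ignoring %s\n", t.pending_exception.c_str());
      t.clear_pending_exception();                 // benign failures may be dropped
    }

    int main() {
      FakeThread t;
      t.pending_exception = "java.lang.ThreadDeath";
      after_upcall(t);
      std::printf("ThreadDeath still pending: %s\n",
                  t.has_pending_exception() ? "yes" : "no");   // prints "yes"
      return 0;
    }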
https://bugs.openjdk.java.net/browse/JDK-8190415 http://cr.openjdk.java.net/~dnsimon/8190415/ -Doug From robbin.ehn at oracle.com Tue Oct 31 14:37:14 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 31 Oct 2017 15:37:14 +0100 Subject: RFR(XL): 8185640: Thread-local handshakes In-Reply-To: <9ff3abc3-9809-a9df-141b-15f0b05bd8a4@oracle.com> References: <6f2f6259-73f1-c09c-063e-39ae528fb96f@oracle.com> <580AD7F0-2713-472C-A440-AAFDDA2D3EB3@oracle.com> <7591f6c0-7192-78c3-fe79-56a7785c43e4@oracle.com> <2ff79d24-90ab-822a-bd61-e01b79c01ada@redhat.com> <8d7678bf2281406da43cbe090276b51f@sap.com> <28bc3976-424d-1e05-cf7f-29bc38ccabcb@oracle.com> <818e352d5e3a450491cf0c140bf129d6@sap.com> <1c4a025da6ad4d39bedc1d6a12549b87@sap.com> <93cce80b-e9d7-f016-1324-2b0f5fac48c4@redhat.com> <59F1F3B3.10701@oracle.com> <43837915-a3a3-b36f-940e-1327937f0f17@redhat.com> <2EB9D7C3-B868-4C3E-BD88-6A4F92A39999@oracle.com> <59F2DC24.8050701@oracle.com> <59F2F01A.403@oracle.com> <4ebb905f23324a00b9cf10d8d410d420@sap.com> <9ff3abc3-9809-a9df-141b-15f0b05bd8a4@oracle.com> Message-ID: Thank you David for having a look. I updated after your review, I think I got it all, please see: http://cr.openjdk.java.net/~rehn/8185640/v9/DavidH-Option-Cleanup-13/webrev/ I'm also updating CSR with product_pd. Short thing: On 10/31/2017 11:27 AM, David Holmes wrote: > > I'm also thinking, if this is platform dependent then shouldn't > ThreadLocalHandshakes be a product_pd flag, with pd specific default setting - > and turning it on when on an unsupported platform should be a error ? Yes, the error checking already exists in: 135 Flag::Error ThreadLocalHandshakesConstraintFunc(bool value, bool verbose) { 136 if (value) { 137 if (!SafepointMechanism::supports_thread_local_poll()) { 138 CommandLineError::print(verbose, "ThreadLocalHandshakes not yet supported on this platform\n"); 139 return Flag::VIOLATES_CONSTRAINT; 140 } 141 if (UseAOT JVMCI_ONLY(|| EnableJVMCI || UseJVMCICompiler)) { 142 CommandLineError::print(verbose, "ThreadLocalHandshakes not yet supported in combination with AOT or JVMCI\n"); 143 return Flag::VIOLATES_CONSTRAINT; 144 } 145 } 146 return Flag::SUCCESS; 147 } Sanity tested with handshake benchmark on all supported + 1 unsupported platform. Thanks, Robbin > > Thanks, > David > ----- > >> Here is webrev for changes needed: >> http://cr.openjdk.java.net/~rehn/8185640/v8/Option-Cleanup-12/webrev/ >> And here is CSR: >> https://bugs.openjdk.java.net/browse/JDK-8189942 >> >> Manual testing + basic testing done. >> >> And since I'm really hoping that this can be the last incremental, here is my >> whole patch queue flatten out: >> http://cr.openjdk.java.net/~rehn/8185640/v8/Full/webrev/ >> >> Thanks, Robbin >> >> On 10/27/2017 04:47 PM, Doerr, Martin wrote: >>> Hi Robbin, >>> >>> excellent. I think this matches what Coleen had proposed, now. >>> Thanks for doing all the work with so many incremental patches and for >>> responding on so many discussions. Seems to be a tough piece of work. >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com] >>> Sent: Freitag, 27. 
Oktober 2017 15:15 >>> To: Erik ?sterlund ; Andrew Haley >>> ; Doerr, Martin ; Karen Kinnear >>> ; Coleen Phillimore (coleen.phillimore at oracle.com) >>> >>> Cc: hotspot-dev developers >>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes >>> >>> Hi all, >>> >>> Poll in switches: >>> http://cr.openjdk.java.net/~rehn/8185640/v7/Interpreter-Poll-Switch-10/ >>> >>> Poll in return: >>> http://cr.openjdk.java.net/~rehn/8185640/v7/Interpreter-Poll-Ret-11/ >>> >>> Please take an extra look at poll in return. >>> >>> Sanity tested, big test run still running (99% complete - OK). >>> >>> Performance regression for the added polls increased to total of -0.68% vs >>> global poll. (was -0.44%) >>> >>> We are discussing the opt-out option, the newest suggestion is to make it >>> diagnostic. Opinions? >>> >>> For anyone applying these patches, the number 9 patch changes the option from >>> product. I have not sent that out. >>> >>> Thanks, Robbin >>> >>> >>> From kumar.x.srinivasan at oracle.com Tue Oct 31 16:42:43 2017 From: kumar.x.srinivasan at oracle.com (Kumar Srinivasan) Date: Tue, 31 Oct 2017 09:42:43 -0700 Subject: RFR: 8190287: Update JDK's internal ASM to ASMv6 In-Reply-To: <59F3690B.6070309@oracle.com> References: <59F3690B.6070309@oracle.com> Message-ID: <59F8A803.9060305@oracle.com> Hi Remi, Are you ok with the ASMv6 changes ? Thanks Kumar On 10/27/2017 10:12 AM, Kumar Srinivasan wrote: > Hello Remi, Sundar and others, > > Please review the webrev [1] to update JDK's internal ASM to v6. > > To help with review areas, you can use the browser to search for mq > patches commented with // > > Highlights of changes: > 1. updated ASMv6 // jdk-new-asmv6.patch > 2. changes to jlink and jar to add ModuleMainClass and ModulePackages > attributes //jdk-new-asm-update.patch > 3. adjustments to jdk tests //jdk-new-asm-test.patch > 4. minor adjustments to hotspot tests //jdk-new-hotspot-test.patch > > Tests: > jdk_tier1, jdk_tier2, testset hotspot, hotspot_tier1, nashorn ant tests, > Alan has also run several tests. > > Big thanks to Alan for #2 and #3 as part of [3]. > > Thanks > Kumar > > [1] http://cr.openjdk.java.net/~ksrini/8190287/webrev.00/index.html > [2] https://bugs.openjdk.java.net/browse/JDK-8190287 > [3] https://bugs.openjdk.java.net/browse/JDK-8186236 > From paul.sandoz at oracle.com Tue Oct 31 17:32:25 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Tue, 31 Oct 2017 10:32:25 -0700 Subject: [10] RFR 8186046 Minimal ConstantDynamic support In-Reply-To: References: <93431280-9CBF-4722-961D-F2D2D0F83B4E@oracle.com> Message-ID: <58726425-BA16-482B-A02E-3B0613CD5010@oracle.com> > On 31 Oct 2017, at 05:58, Dmitry Samersoff wrote: > > Paul and Frederic, > > Thank you. > > One more question. Do we need to call verify_oop below? > > 509 { // Check for the null sentinel. > ... > 517 xorptr(result, result); // NULL object reference > ... > > 521 if (VerifyOops) { > 522 verify_oop(result); > 523 } > I believe it?s harmless. When the flag is on it eventually results in a call to the stub generated by generate_verify_oop: http://hg.openjdk.java.net/jdk10/hs/file/tip/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp#l1023 // make sure object is 'reasonable' __ testptr(rax, rax); __ jcc(Assembler::zero, exit); // if obj is NULL it is OK If the oop is null the verification will exit safely. Paul. > -Dmitry > > > On 31.10.2017 00:56, Frederic Parain wrote: >> I?m seeing no issue with rcx being aliased in this code. 
>> >> Fred >> >>> On Oct 30, 2017, at 15:44, Paul Sandoz wrote: >>> >>> Hi, >>> >>> Thanks for reviewing. >>> >>>> On 30 Oct 2017, at 11:05, Dmitry Samersoff wrote: >>>> >>>> Paul, >>>> >>>> templateTable_x86.cpp: >>>> >>>> 564 const Register flags = rcx; >>>> 565 const Register rarg = NOT_LP64(rcx) LP64_ONLY(c_rarg1); >>>> >>>> Should we use another register for rarg under NOT_LP64 ? >>>> >>> >>> I think it should be ok, it i ain?t an expert here on the interpreter and the calling conventions, so please correct me. >>> >>> Some more context: >>> >>> + const Register flags = rcx; >>> + const Register rarg = NOT_LP64(rcx) LP64_ONLY(c_rarg1); >>> + __ movl(rarg, (int)bytecode()); >>> >>> The current bytecode code is loaded into ?rarg? >>> >>> + call_VM(obj, CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_ldc), rarg); >>> >>> Then ?rarg" is the argument to the call to InterpreterRuntime::resolve_ldc, after which it is no longer referred to. >>> >>> +#ifndef _LP64 >>> + // borrow rdi from locals >>> + __ get_thread(rdi); >>> + __ get_vm_result_2(flags, rdi); >>> + __ restore_locals(); >>> +#else >>> + __ get_vm_result_2(flags, r15_thread); >>> +#endif >>> >>> The result from the call is then loaded into flags. >>> >>> So i don?t think it matters in this case if rcx is aliased. >>> >>> Paul. >>> >>>> -Dmitry >>>> >>>> >>>> On 10/26/2017 08:03 PM, Paul Sandoz wrote: >>>>> Hi, >>>>> >>>>> Please review the following patch for minimal dynamic constant support: >>>>> >>>>> http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8186046-minimal-condy-support-hs/webrev/ >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8186046 >>>>> https://bugs.openjdk.java.net/browse/JDK-8186209 >>>>> >>>>> This patch is based on the JDK 10 unified HotSpot repository. Testing so far looks good. >>>>> >>>>> By minimal i mean just the support in the runtime for a dynamic constant pool entry to be referenced by a LDC instruction or a bootstrap method argument. Much of the work leverages the foundations built by invoke dynamic but is arguably simpler since resolution is less complex. >>>>> >>>>> A small set of bootstrap methods will be proposed as a follow on issue for 10 (these are currently being refined in the amber repository). >>>>> >>>>> Bootstrap method invocation has not changed (and the rules are the same for dynamic constants and indy). It is planned to enhance this in a further major release to support lazy resolution of bootstrap method arguments. >>>>> >>>>> The CSR for the VM specification is here: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8189199 >>>>> >>>>> the j.l.invoke package documentation was also updated but please consider the VM specification as the definitive "source of truth" (we may clean up this area further later on so it becomes more informative, and that may also apply to duplicative text on MethodHandles/VarHandles). >>>>> >>>>> Any AoT-related work will be deferred to a future release. >>>>> >>>>> ? >>>>> >>>>> This patch only supports x64 platforms. There is a small set of changes specific to x64 (specifically to support null and primitives constants, as prior to this patch null was used as a sentinel for resolution and certain primitives types would never have been encountered, such as say byte). >>>>> >>>>> We will need to follow up with the SPARC platform and it is hoped/anticipated that OpenJDK members responsible for other platforms (namely ARM and PPC) will separately provide patches. >>>>> >>>>> ? 
>>>>> >>>>> Many of tests rely on an experimental byte code API that supports the generation of byte code with dynamic constants. >>>>> >>>>> One test uses class file bytes produced from a modified version of asmtools. The modifications have now been pushed but a new version of asmtools need to be rolled into jtreg before the test can operate directly on asmtools information rather than embedding class file bytes directly in the test. >>>>> >>>>> ? >>>>> >>>>> Paul. >>>>> >>>> >>> >> > > From paul.sandoz at oracle.com Tue Oct 31 19:32:28 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Tue, 31 Oct 2017 12:32:28 -0700 Subject: [10] RFR 8186046 Minimal ConstantDynamic support In-Reply-To: References: Message-ID: Lois identified and fixed a bug found when running the JCK VM tests. I merged the changes below into the current webrev. Paul. --- old/src/hotspot/share/interpreter/linkResolver.cpp 2017-10-31 11:56:30.541287505 -0400 +++ new/src/hotspot/share/interpreter/linkResolver.cpp 2017-10-31 11:56:29.215676272 -0400 @@ -301,14 +301,14 @@ if (vca_result != Reflection::ACCESS_OK) { ResourceMark rm(THREAD); char* msg = Reflection::verify_class_access_msg(ref_klass, - InstanceKlass::cast(sel_klass), + InstanceKlass::cast(base_klass), vca_result); if (msg == NULL) { Exceptions::fthrow( THREAD_AND_LOCATION, vmSymbols::java_lang_IllegalAccessError(), "failed to access class %s from class %s", - sel_klass->external_name(), + base_klass->external_name(), ref_klass->external_name()); } else { // Use module specific message returned by verify_class_access_msg(). > On 26 Oct 2017, at 10:03, Paul Sandoz wrote: > > Hi, > > Please review the following patch for minimal dynamic constant support: > > http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8186046-minimal-condy-support-hs/webrev/ > > https://bugs.openjdk.java.net/browse/JDK-8186046 > https://bugs.openjdk.java.net/browse/JDK-8186209 > > This patch is based on the JDK 10 unified HotSpot repository. Testing so far looks good. > > By minimal i mean just the support in the runtime for a dynamic constant pool entry to be referenced by a LDC instruction or a bootstrap method argument. Much of the work leverages the foundations built by invoke dynamic but is arguably simpler since resolution is less complex. > > A small set of bootstrap methods will be proposed as a follow on issue for 10 (these are currently being refined in the amber repository). > > Bootstrap method invocation has not changed (and the rules are the same for dynamic constants and indy). It is planned to enhance this in a further major release to support lazy resolution of bootstrap method arguments. > > The CSR for the VM specification is here: > > https://bugs.openjdk.java.net/browse/JDK-8189199 > > the j.l.invoke package documentation was also updated but please consider the VM specification as the definitive "source of truth" (we may clean up this area further later on so it becomes more informative, and that may also apply to duplicative text on MethodHandles/VarHandles). > > Any AoT-related work will be deferred to a future release. > > ? > > This patch only supports x64 platforms. There is a small set of changes specific to x64 (specifically to support null and primitives constants, as prior to this patch null was used as a sentinel for resolution and certain primitives types would never have been encountered, such as say byte). 
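The "null was used as a sentinel" remark is worth unpacking with a stand-alone sketch; CacheEntry and NULL_SENTINEL are illustrative, not the constant-pool cache types, but the x64 code quoted elsewhere in this thread does the analogous rewrite from a sentinel back to a real NULL reference:

    #include <cstdio>

    static int  s_the_null_sentinel;                 // any unique address will do
    static void* const NULL_SENTINEL = &s_the_null_sentinel;

    struct CacheEntry {
      void* resolved = nullptr;                      // nullptr still means "not resolved yet"
      bool  is_resolved() const { return resolved != nullptr; }
      void* value()       const { return resolved == NULL_SENTINEL ? nullptr : resolved; }
      void  set_resolved_null()  { resolved = NULL_SENTINEL; }   // a constant whose value is null
    };

    int main() {
      CacheEntry e;
      std::printf("resolved? %d\n", e.is_resolved());            // 0: nothing cached yet
      e.set_resolved_null();                                     // resolve to the null constant
      std::printf("resolved? %d, value = %p\n",
                  e.is_resolved(), e.value());                   // 1, and the value is a real null
      return 0;
    }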
> > We will need to follow up with the SPARC platform and it is hoped/anticipated that OpenJDK members responsible for other platforms (namely ARM and PPC) will separately provide patches. > > ? > > Many of tests rely on an experimental byte code API that supports the generation of byte code with dynamic constants. > > One test uses class file bytes produced from a modified version of asmtools. The modifications have now been pushed but a new version of asmtools need to be rolled into jtreg before the test can operate directly on asmtools information rather than embedding class file bytes directly in the test. > > ? > > Paul. From sundararajan.athijegannathan at oracle.com Tue Oct 31 04:27:35 2017 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Tue, 31 Oct 2017 09:57:35 +0530 Subject: RFR: 8190287: Update JDK's internal ASM to ASMv6 In-Reply-To: <59F3690B.6070309@oracle.com> References: <59F3690B.6070309@oracle.com> Message-ID: <59F7FBB7.2080400@oracle.com> jlink changes look good. I ran jlink tests and all nashorn tests (jtreg as well as ant test/test262parallel) after applying the patch locally. All fine! +1 -Sundar On 27/10/17, 10:42 PM, Kumar Srinivasan wrote: > Hello Remi, Sundar and others, > > Please review the webrev [1] to update JDK's internal ASM to v6. > > To help with review areas, you can use the browser to search for mq > patches commented with // > > Highlights of changes: > 1. updated ASMv6 // jdk-new-asmv6.patch > 2. changes to jlink and jar to add ModuleMainClass and ModulePackages > attributes //jdk-new-asm-update.patch > 3. adjustments to jdk tests //jdk-new-asm-test.patch > 4. minor adjustments to hotspot tests //jdk-new-hotspot-test.patch > > Tests: > jdk_tier1, jdk_tier2, testset hotspot, hotspot_tier1, nashorn ant tests, > Alan has also run several tests. > > Big thanks to Alan for #2 and #3 as part of [3]. > > Thanks > Kumar > > [1] http://cr.openjdk.java.net/~ksrini/8190287/webrev.00/index.html > [2] https://bugs.openjdk.java.net/browse/JDK-8190287 > [3] https://bugs.openjdk.java.net/browse/JDK-8186236 > From mandy.chung at oracle.com Tue Oct 31 21:43:59 2017 From: mandy.chung at oracle.com (mandy chung) Date: Tue, 31 Oct 2017 14:43:59 -0700 Subject: [10] RFR 8186046 Minimal ConstantDynamic support In-Reply-To: References: Message-ID: <230aad0f-8649-baf2-71e8-8efc75d0cb16@oracle.com> On 10/26/17 10:03 AM, Paul Sandoz wrote: > Hi, > > Please review the following patch for minimal dynamic constant support: > > http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8186046-minimal-condy-support-hs/webrev/ I reviewed the non-hotspot change as a learning exercise (I am not close to j.l.invoke implementation).? I assume DynamicConstant intends to be non-public in this patch, right? 
30 public final class DynamicConstant Mandy From paul.sandoz at oracle.com Tue Oct 31 22:53:30 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Tue, 31 Oct 2017 15:53:30 -0700 Subject: [10] RFR 8186046 Minimal ConstantDynamic support In-Reply-To: <230aad0f-8649-baf2-71e8-8efc75d0cb16@oracle.com> References: <230aad0f-8649-baf2-71e8-8efc75d0cb16@oracle.com> Message-ID: <05E86643-EE91-49BB-9A57-B291AA087211@oracle.com> > On 31 Oct 2017, at 14:43, mandy chung wrote: > > > > On 10/26/17 10:03 AM, Paul Sandoz wrote: >> Hi, >> >> Please review the following patch for minimal dynamic constant support: >> >> >> http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8186046-minimal-condy-support-hs/webrev/ > > > I reviewed the non-hotspot change as a learning exercise (I am not close to j.l.invoke implementation). I assume DynamicConstant intends to be non-public in this patch, right? > 30 public final class DynamicConstant > Well spotted. More likely to be renamed to ConstantBootstraps when a minimal set of dynamic constant bootstraps will be proposed (likely this week) as a follow on patch. I'll make it non-public in the updated webrev so as to keep this patch self-contained. Paul.