From jini.george at oracle.com Sun Oct 1 15:51:26 2017
From: jini.george at oracle.com (Jini George)
Date: Sun, 1 Oct 2017 21:21:26 +0530
Subject: RFR: 8187402: UnknownOopException is occurred on Stack Memory window in HSDB
In-Reply-To:
References: <4c8a242f-464e-297e-5779-a237691be495@gmail.com> <6a49c097-10dc-6a47-154e-a6f5ae2e96ec@oracle.com>
Message-ID: <94dfc272-cf4a-b499-8616-3e55d1e7a2ea@oracle.com>

Apologies for the delay in responding to this, Yasumasa.

I tried my hand at creating a test case for this by attaching an SA process to jshell and invoking a method to traverse the frame oopmaps for the 'output reader' thread. Please take a look at:

http://cr.openjdk.java.net/~jgeorge/sponsorships/8187402_ysuenaga/TestFrameOopMap.java

I think that, in general, for issues that manifest through the GUI, we can try writing unit test cases that directly invoke the methods involved.

Thanks,
Jini.

On 9/27/2017 4:10 AM, Yasumasa Suenaga wrote:
> Hi Jini,
>
> IMHO it is too difficult to create test cases for this issue (JDK-8187402)
> and JDK-8187403 because they are problems in the Stack Memory window in HSDB.
> Can we add the noreg-hard label in JBS?
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2017/09/27 2:36, Jini George wrote:
>> Hi Yasumasa,
>>
>> The changes look fine, but please include a test case for
>> this as well. In general, it would be great if you could provide test cases
>> along with the code changes when sending them for review.
>>
>> Thank you,
>> Jini.
>>
>> On 9/26/2017 8:19 PM, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> I uploaded a new webrev adapted to jdk10/hs:
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.01/
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2017/09/21 7:47, Yasumasa Suenaga wrote:
>>>> PING:
>>>>
>>>> Have you checked this issue?
>>>>
>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.00/
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2017/09/11 11:17, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> This review request is a part of [1].
>>>>>
>>>>>
>>>>> JBS:
>>>>>   https://bugs.openjdk.java.net/browse/JDK-8187402
>>>>>
>>>>> webrev:
>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.00/
>>>>>
>>>>>
>>>>> I cannot access JPRT, so I need a sponsor.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> [1]
>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-September/021821.html
>>>>>
>>>>>
>>>>>
>>

From yasuenag at gmail.com Sun Oct 1 23:33:58 2017
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Mon, 2 Oct 2017 08:33:58 +0900
Subject: RFR: 8187402: UnknownOopException is occurred on Stack Memory window in HSDB
In-Reply-To: <94dfc272-cf4a-b499-8616-3e55d1e7a2ea@oracle.com>
References: <4c8a242f-464e-297e-5779-a237691be495@gmail.com> <6a49c097-10dc-6a47-154e-a6f5ae2e96ec@oracle.com> <94dfc272-cf4a-b499-8616-3e55d1e7a2ea@oracle.com>
Message-ID:

Hi Jini,

Thank you for sharing the test case. However, I have the following concerns:

1. This bug appears in JIT'ed frames, so it might not reproduce on the test server. We can add -Xcomp or a JSON compiler control file for this issue, but AFAIK we cannot control the compile level (TieredCompilation).

2. If the JShell implementation changes in the future, this test case might no longer verify this bug.

I will merge your test case into the webrev if these concerns can be ignored.

Thanks,

Yasumasa

2017/10/02 0:51 "Jini George" :

> Apologies for the delay in responding to this, Yasumasa.
> > I tried my hand at creating a test case for this by attaching an SA > process to jshell and by invoking a method to traverse the frame oopmaps > for the 'output reader' thread -- Please do take a look at: > > http://cr.openjdk.java.net/~jgeorge/sponsorships/8187402_ysu > enaga/TestFrameOopMap.java > > I think, in general, for the issues manifested through the GUI, we can > probably try having unit test cases directly invoking the methods involved. > > Thanks, > Jini. > > On 9/27/2017 4:10 AM, Yasumasa Suenaga wrote: > >> Hi Jini, >> >> IMHO this issue (JDK-8187402) and JDK-8187403 are too difficult to crate >> test cases because they are problems in Stack Memory window in HSDB. >> Can we add noreg-hard label to JBS? >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2017/09/27 2:36, Jini George wrote: >> >>> Hi Yasumasa, >>> >>> The changes look fine, but please do include the test case also for >>> this. In general, it would be great if you could provide test cases also >>> along with the code changes while sending for review. >>> >>> Thank you, >>> Jini. >>> >>> On 9/26/2017 8:19 PM, Yasumasa Suenaga wrote: >>> >>>> Hi all, >>>> >>>> I uploaded new webrev to be adapted to jdk10/hs: >>>> >>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.01/ >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2017/09/21 7:47, Yasumasa Suenaga wrote: >>>> >>>>> PING: >>>>> >>>>> Have you checked this issue? >>>>> >>>>> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.00/ >>>>>> >>>>> >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2017/09/11 11:17, Yasumasa Suenaga wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> This review request is a part of [1]. >>>>>> >>>>>> >>>>>> JBS: >>>>>> ?? https://bugs.openjdk.java.net/browse/JDK-8187402 >>>>>> >>>>>> webrev: >>>>>> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.00/ >>>>>> >>>>>> >>>>>> I cannot access JPRT. So I need a sponsor. 
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-September/021821.html
>>>>>>
>>>>>>
>>>>>>
>>>

From jcbeyler at google.com Tue Oct 3 03:52:30 2017
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 2 Oct 2017 20:52:30 -0700
Subject: Low-Overhead Heap Profiling
In-Reply-To:
References: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> <365499b6-3f4d-a4df-9e7e-e72a739fb26b@oracle.com> <102c59b8-25b6-8c21-8eef-1de7d0bbf629@oracle.com> <1497366226.2829.109.camel@oracle.com> <1498215147.2741.34.camel@oracle.com> <044f8c75-72f3-79fd-af47-7ee875c071fd@oracle.com> <23f4e6f5-c94e-01f7-ef1d-5e328d4823c8@oracle.com>
Message-ID:

Dear all,

Small update to the webrev:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.09_10/

Full webrev is here:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/

I updated some of the naming, removed a TODO comment, and added a test for the sampling rate.

I also raised the maximum stack depth to 1024; there is no reason to keep it so small. I ran a micro benchmark to measure the overhead, and it seems to be roughly the same.

I compared allocations from a stack depth of 10 with allocations from a stack depth of 1024 (the allocations come from the same helper method in
http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java ):
- For an array of 1 integer allocated in a loop; stack depth 1024 vs stack depth 10: 1% slower
- For an array of 200k integers allocated in a loop; stack depth 1024 vs stack depth 10: 3% slower

So basically the maximum stack depth is now 1024, but we only copy over the stack frames actually used.
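To illustrate that last point, here is a minimal sketch in Java of copying only the frames in use out of a buffer sized for the maximum depth. This is not the HotSpot code from the webrev: the class and method names (`StackTraceCopy`, `copyUsedFrames`) and the use of a `long[]` to stand in for frames are assumptions made purely for illustration. Only the 1024 limit comes from the message above.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not the HotSpot implementation. */
final class StackTraceCopy {
    // Upper bound on frames recorded per sample (1024, as in webrev.10).
    static final int MAX_STACK_DEPTH = 1024;

    /**
     * Returns a copy that holds only the frames actually filled in.
     * The requested depth is clamped to [0, MAX_STACK_DEPTH] and to the
     * size of the scratch buffer, so a shallow stack costs a small copy.
     */
    static long[] copyUsedFrames(long[] scratch, int usedDepth) {
        int limit = Math.min(scratch.length, MAX_STACK_DEPTH);
        int depth = Math.max(0, Math.min(usedDepth, limit));
        return Arrays.copyOf(scratch, depth);
    }

    public static void main(String[] args) {
        // Scratch buffer sized for the worst case.
        long[] scratch = new long[MAX_STACK_DEPTH];
        // Pretend the sampled thread only had 10 frames on its stack.
        for (int i = 0; i < 10; i++) {
            scratch[i] = 0x1000 + i;
        }
        long[] stored = copyUsedFrames(scratch, 10);
        System.out.println(stored.length); // prints 10
    }
}
```

With a layout like this, the stored trace for a shallow stack stays small even though the limit is large, which would be consistent with the small slowdowns measured above.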
For the next webrev, I will add a stack depth test to show that it works, and I will probably put back the mutex locking so that we can see how difficult it is to keep the code thread-safe.

Let me know what you think!
Jc

On Mon, Sep 25, 2017 at 3:02 PM, JC Beyler wrote:

> Forgot to say that for my numbers:
> - Not in the test are the actual numbers I got for the various array
> sizes. I ran the program 30 times and parsed the output; here are the
> averages and standard deviations:
>   1000: 1.28% average; 1.13% standard deviation
>   10000: 1.59% average; 1.25% standard deviation
>   100000: 1.26% average; 1.26% standard deviation
>
> The 1000/10000/100000 are the sizes of the arrays being allocated. These
> are allocated 100k times and the sampling rate is 111 times the size of the
> array.
>
> Thanks!
> Jc
>
>
> On Mon, Sep 25, 2017 at 3:01 PM, JC Beyler wrote:
>
>> Hi all,
>>
>> After a bit of a break, I am back working on this :). As before, here are
>> two webrevs:
>>
>> - Full change set: http://cr.openjdk.java.net/~rasbold/8171119/webrev.09/
>> - Compared to version 8: http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/
>> (This version is compared to the version 8 I last showed, but ported to
>> the new folder hierarchy)
>>
>> In this version I have:
>> - Handled Thomas' comments from his email of 07/03:
>> - Merged the logging to be standard
>> - Fixed up the code a bit where asked
>> - Added some notes about the code not being thread-safe yet
>> - Removed additional dead code from the version that modifies
>> interpreter/c1/c2
>> - Fixed compiler issues so that it compiles with
>> --disable-precompiled-headers
>> - Tested with ./configure --with-boot-jdk=
>> --with-debug-level=slowdebug --disable-precompiled-headers
>>
>> Additionally, I added a test to check the sanity of the sampler:
>> HeapMonitorStatCorrectnessTest (http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatCorrectnessTest.java.patch)
>> - This allocates a number of arrays and checks that we obtain the
>> number of samples we want with an accepted error of 5%. I tested it 100
>> times and it passed every time; I can test more if wanted
>> - Not in the test are the actual numbers I got for the various array
>> sizes. I ran the program 30 times and parsed the output; here are the
>> averages and standard deviations:
>>   1000: 1.28% average; 1.13% standard deviation
>>   10000: 1.59% average; 1.25% standard deviation
>>   100000: 1.26% average; 1.26% standard deviation
>>
>> What this means is that we were always within about 1-2% of the number of
>> samples the test expected.
>>
>> Let me know what you think,
>> Jc
>>
>>
>>
>> On Wed, Jul 5, 2017 at 9:31 PM, JC Beyler wrote:
>>
>>> Hi all,
>>>
>>> I apologize, I have not yet handled your remarks, but I thought this new
>>> webrev would also be useful to see and comment on.
>>>
>>> Here is the latest webrev. It is generated slightly differently from the
>>> others since I'm now using webrev.ksh without the -N option:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.08/
>>>
>>> And the webrev.07 to webrev.08 diff is here:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.07_08/
>>>
>>> (Let me know if it works well)
>>>
>>> It's a small change between versions but it:
>>> - provides a fix that makes the average sample rate correct (more on
>>> that below).
>>> - fixes the code so that it actually plays nicely with the fast TLAB
>>> refill
>>> - cleans up the JVMTI text a bit and now uses jvmtiFrameInfo
>>> - moves the capability to be onload solo
>>>
>>> With this webrev, I've done a small study of the random number generator
>>> we use here for the sampling rate.
>>> I took a small program that can be
>>> simplified to:
>>>
>>> for (outer loop)
>>>   for (inner loop)
>>>     int[] tmp = new int[arraySize];
>>>
>>> - I've fixed the outer and inner loops at 800 iterations for this experiment,
>>> meaning we allocate an array of a given size 640000 times.
>>>
>>> - Each program reports the average sample size used over the whole
>>> execution
>>>
>>> - Then, I ran each variation 30 times and calculated the average of
>>> the average sample size for each array size. I selected the array
>>> size to be one of the following: 1, 10, 100, 1000, 10000.
>>>
>>> - Compared to 512 KB, the error of the average sample size over 30 runs:
>>>   1: 4.62% error
>>>   10: 3.09% error
>>>   100: 0.36% error
>>>   1000: 0.1% error
>>>   10000: 0.03% error
>>>
>>> What it shows is that the average gets better as the number of samples
>>> grows. This is because with an allocation of 1 element per
>>> array, it takes longer to hit one of the thresholds. This is seen by
>>> looking at the sample count statistic I put in. For the same number of
>>> iterations (800 * 800), the different array sizes produce:
>>>   1: 62 samples
>>>   10: 125 samples
>>>   100: 788 samples
>>>   1000: 6166 samples
>>>   10000: 57721 samples
>>>
>>> And of course, the more samples you have, the more sample rates you
>>> pick, which means that your average gets closer to the expected value.
>>>
>>> Thanks,
>>> Jc
>>>
>>> On Thu, Jun 29, 2017 at 10:01 PM, JC Beyler wrote:
>>>
>>>> Thanks Robbin,
>>>>
>>>> This seems to have worked. When I have the next webrev ready, we will
>>>> find out, but I'm fairly confident it will work!
>>>>
>>>> Thanks again!
>>>> Jc
>>>>
>>>> On Wed, Jun 28, 2017 at 11:46 PM, Robbin Ehn wrote:
>>>>
>>>>> Hi JC,
>>>>>
>>>>> On 06/29/2017 12:15 AM, JC Beyler wrote:
>>>>>
>>>>>> B) Incremental changes
>>>>>>
>>>>>
>>>>> I guess the most common workflow here is using mq:
>>>>> hg qnew fix_v1
>>>>> edit files
>>>>> hg qrefresh
>>>>> hg qnew fix_v2
>>>>> edit files
>>>>> hg qrefresh
>>>>>
>>>>> If you do hg log you will see 2 commits.
>>>>>
>>>>> webrev.ksh -r -2 -o my_inc_v1_v2
>>>>> webrev.ksh -o my_full_v2
>>>>>
>>>>>
>>>>> In your .hgrc you might need:
>>>>> [extensions]
>>>>> mq =
>>>>>
>>>>> /Robbin
>>>>>
>>>>>
>>>>>> Again, another newbie question here...
>>>>>>
>>>>>> For showing the incremental changes, is there a link that explains
>>>>>> how to do that? I apologize for my newbie questions all the time :)
>>>>>>
>>>>>> Right now, I do:
>>>>>>
>>>>>> ksh ../webrev.ksh -m -N
>>>>>>
>>>>>> That generates a webrev.zip, which I send to Chuck Rasbold. He then
>>>>>> uploads it to a new webrev.
>>>>>>
>>>>>> I tried committing my change and adding a small change. Then if I just
>>>>>> run ksh ../webrev.ksh without any options, it seems to produce a similar
>>>>>> page but now with only the changes I had (so the 06-07 comparison you were
>>>>>> talking about) and a changeset that has it all. I imagine that is what you
>>>>>> meant.
>>>>>>
>>>>>> Which means that my workflow would become:
>>>>>>
>>>>>> 1) Make changes
>>>>>> 2) Make a webrev without any options to show just the differences
>>>>>> with the tip
>>>>>> 3) Amend my changes to my local commit so that I am done with them
>>>>>> 4) Go to 1
>>>>>>
>>>>>> Does that seem correct to you?
>>>>>>
>>>>>> Note that when I do this, I only see the full change of a file in the
>>>>>> full change set (Side note here: now the page says change set and not
>>>>>> patch, which is maybe why Serguei was having issues?).
>>>>>>
>>>>>> Thanks!
>>>>>> Jc
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 28, 2017 at 1:12 AM, Robbin Ehn wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 06/28/2017 12:04 AM, JC Beyler wrote:
>>>>>>
>>>>>> Dear Thomas et al,
>>>>>>
>>>>>> Here is the newest webrev:
>>>>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/
>>>>>>
>>>>>>
>>>>>>
>>>>>> You have some more bits to do in there, but generally this looks good
>>>>>> and really nice with more tests.
>>>>>> I'll do a deep dive and re-test this when I get back from my
>>>>>> long vacation with whatever patch version you have then.
>>>>>>
>>>>>> Also I think it's time you provide incremental (v06->07 changes)
>>>>>> as well as complete change-sets.
>>>>>>
>>>>>> Thanks, Robbin
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thomas, I "think" I have answered all your remarks. The
>>>>>> summary is:
>>>>>>
>>>>>> - The statistic system is up and provides insight on what the
>>>>>> heap sampler is doing
>>>>>> - I've noticed that, though the sampling rate is at the
>>>>>> right mean, we are missing some samples; I have not yet tracked down why
>>>>>> (details below)
>>>>>>
>>>>>> - I've run a tiny benchmark that is the worst case: it is a
>>>>>> very tight loop that allocates a small array
>>>>>> - In this case, I see no overhead when the system is off
>>>>>> so that is a good start :)
>>>>>> - I see right now a high overhead in this case when
>>>>>> sampling is on. This is not really too surprising but I'm going to see if
>>>>>> this is consistent with our
>>>>>> internal implementation. The benchmark is really allocation
>>>>>> stressful so I'm not too surprised but I want to do the due diligence.
>>>>>> >>>>>> - The statistic system up is up and I have a new test >>>>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/s >>>>>> erviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTes >>>>>> t.java.patch >>>>>> >>>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTe >>>>>> st.java.patch> >>>>>> - I did a bit of a study about the random generator >>>>>> here, more details are below but basically it seems to work well >>>>>> >>>>>> - I added a capability but since this is the first time >>>>>> doing this, I was not sure I did it right >>>>>> - I did add a test though for it and the test seems to >>>>>> do what I expect (all methods are failing with the >>>>>> JVMTI_ERROR_MUST_POSSESS_CAPABILITY error). >>>>>> - http://cr.openjdk.java.net/~ra >>>>>> sbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonito >>>>>> r/MyPackage/HeapMonitorNoCapabilityTest.java.patch >>>>>> >>>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapa >>>>>> bilityTest.java.patch> >>>>>> >>>>>> - I still need to figure out what to do about the >>>>>> multi-agent vs single-agent issue >>>>>> >>>>>> - As far as measurements, it seems I still need to look at: >>>>>> - Why we do the 20 random calls first, are they >>>>>> necessary? >>>>>> - Look at the mean of the sampling rate that the random >>>>>> generator does and also what is actually sampled >>>>>> - What is the overhead in terms of memory/performance >>>>>> when on? >>>>>> >>>>>> I have inlined my answers, I think I got them all in the new >>>>>> webrev, let me know your thoughts. >>>>>> >>>>>> Thanks again! 
>>>>>> Jc >>>>>> >>>>>> >>>>>> On Fri, Jun 23, 2017 at 3:52 AM, Thomas Schatzl < >>>>>> thomas.schatzl at oracle.com >>>>> thomas.schatzl at oracle.com >>>>>> >>>>>> >> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> On Wed, 2017-06-21 at 13:45 -0700, JC Beyler wrote: >>>>>> > Hi all, >>>>>> > >>>>>> > First off: Thanks again to Robbin and Thomas for their >>>>>> reviews :) >>>>>> > >>>>>> > Next, I've uploaded a new webrev: >>>>>> > http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/ >>>>>> >>>>>> >>>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/>> >>>>>> >>>>>> > >>>>>> > Here is an update: >>>>>> > >>>>>> > - @Robbin, I forgot to say that yes I need to look at >>>>>> implementing >>>>>> > this for the other architectures and testing it before >>>>>> it is all >>>>>> > ready to go. Is it common to have it working on all >>>>>> possible >>>>>> > combinations or is there a subset that I should be >>>>>> doing first and we >>>>>> > can do the others later? >>>>>> > - I've tested slowdebug, built and ran the JTreg tests >>>>>> I wrote with >>>>>> > slowdebug and fixed a few more issues >>>>>> > - I've refactored a bit of the code following Thomas' >>>>>> comments >>>>>> > - I think I've handled all the comments from Thomas >>>>>> (I put >>>>>> > comments inline below for the specifics) >>>>>> >>>>>> Thanks for handling all those. >>>>>> >>>>>> > - Following Thomas' comments on statistics, I want to >>>>>> add some >>>>>> > quality assurance tests and find that the easiest way >>>>>> would be to >>>>>> > have a few counters of what is happening in the >>>>>> sampler and expose >>>>>> > that to the user. >>>>>> > - I'll be adding that in the next version if no one >>>>>> sees any >>>>>> > objections to that. 
>>>>>> > - This will allow me to add a sanity test in JTreg >>>>>> about number of >>>>>> > samples and average of sampling rate >>>>>> > >>>>>> > @Thomas: I had a few questions that I inlined below >>>>>> but I will >>>>>> > summarize the "bigger ones" here: >>>>>> > - You mentioned constants are not using the right >>>>>> conventions, I >>>>>> > looked around and didn't see any convention except >>>>>> normal naming then >>>>>> > for static constants. Is that right? >>>>>> >>>>>> I looked through https://wiki.openjdk.java.net/ >>>>>> display/HotSpot/StyleGui >>>>> /display/HotSpot/StyleGui> >>>>>> >>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGui>> >>>>>> de and the rule is to "follow an existing pattern and >>>>>> must have a >>>>>> distinct appearance from other names". Which does not >>>>>> help a lot I >>>>>> guess :/ The GC team started using upper camel case, e.g. >>>>>> SomeOtherConstant, but very likely this is probably not >>>>>> applied >>>>>> consistently throughout. So I am fine with not adding >>>>>> another style >>>>>> (like kMaxStackDepth with the "k" in front with some >>>>>> unknown meaning) >>>>>> is fine. >>>>>> >>>>>> (Chances are you will find that style somewhere used >>>>>> anyway too, >>>>>> apologies if so :/) >>>>>> >>>>>> >>>>>> Thanks for that link, now I know where to look. I used the >>>>>> upper camel case in my code as well then :) I should have gotten them all. 
>>>>>> >>>>>> >>>>>> > PS: I've also inlined my answers to Thomas below: >>>>>> > >>>>>> > On Tue, Jun 13, 2017 at 8:03 AM, Thomas Schatzl >>>>>> >>>>> > e.com > wrote: >>>>>> > > Hi all, >>>>>> > > >>>>>> > > On Mon, 2017-06-12 at 11:11 -0700, JC Beyler wrote: >>>>>> > > > Dear all, >>>>>> > > > >>>>>> > > > I've continued working on this and have done the >>>>>> following >>>>>> > > webrev: >>>>>> > > > http://cr.openjdk.java.net/~ra >>>>>> sbold/8171119/webrev.05/ >>>>> asbold/8171119/webrev.05/> >>>>>> >>>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/>> >>>>>> >>>>>> > > >>>>>> > > [...] >>>>>> > > > Things I still need to do: >>>>>> > > > - Have to fix that TLAB case for the >>>>>> FastTLABRefill >>>>>> > > > - Have to start looking at the data to see >>>>>> that it is >>>>>> > > consistent and does gather the right samples, right >>>>>> frequency, etc. >>>>>> > > > - Have to check the GC elements and what that >>>>>> produces >>>>>> > > > - Run a slowdebug run and ensure I fixed all >>>>>> those issues you >>>>>> > > saw > Robbin >>>>>> > > > >>>>>> > > > Thanks for looking at the webrev and have a great >>>>>> week! >>>>>> > > >>>>>> > > scratching a bit on the surface of this change, >>>>>> so apologies for >>>>>> > > rather shallow comments: >>>>>> > > >>>>>> > > - macroAssembler_x86.cpp:5604: while this is >>>>>> compiler code, and I >>>>>> > > am not sure this is final, please avoid littering >>>>>> the code with >>>>>> > > TODO remarks :) They tend to be candidates for >>>>>> later wtf moments >>>>>> > > only. >>>>>> > > >>>>>> > > Just file a CR for that. >>>>>> > > >>>>>> > Newcomer question: what is a CR and not sure I have >>>>>> the rights to do >>>>>> > that yet ? :) >>>>>> >>>>>> Apologies. CR is a change request, this suggests to file >>>>>> a bug in the >>>>>> bug tracker. And you are right, you can't just create a >>>>>> new account in >>>>>> the OpenJDK JIRA yourselves. 
:( >>>>>> >>>>>> >>>>>> Ok good to know, I'll continue with my own todo list but I'll >>>>>> work hard on not letting it slip in the webrevs anymore :) >>>>>> >>>>>> >>>>>> I was mostly referring to the "... but it is a TODO" >>>>>> part of that >>>>>> comment in macroassembler_x86.cpp. Comments about the >>>>>> why of the code >>>>>> are appreciated. >>>>>> >>>>>> [Note that I now understand that this is to some degree >>>>>> still work in >>>>>> progress. As long as the final changeset does no contain >>>>>> TODO's I am >>>>>> fine (and it's not a hard objection, rather their use in >>>>>> "final" code >>>>>> is typically limited in my experience)] >>>>>> >>>>>> 5603 // Currently, if this happens, just set back the >>>>>> actual end to >>>>>> where it was. >>>>>> 5604 // We miss a chance to sample here. >>>>>> >>>>>> Would be okay, if explaining "this" and the "why" of >>>>>> missing a chance >>>>>> to sample here would be best. >>>>>> >>>>>> Like maybe: >>>>>> >>>>>> // If we needed to refill TLABs, just set the actual end >>>>>> point to >>>>>> // the end of the TLAB again. We do not sample here >>>>>> although we could. >>>>>> >>>>>> Done with your comment, it works well in my mind. >>>>>> >>>>>> I am not sure whether "miss a chance to sample" meant >>>>>> "we could, but >>>>>> consciously don't because it's not that useful" or "it >>>>>> would be >>>>>> necessary but don't because it's too complicated to do.". >>>>>> >>>>>> Looking at the original comment once more, I am also not >>>>>> sure if that >>>>>> comment shouldn't referring to the "end" variable (not >>>>>> actual_end) >>>>>> because that's the variable that is responsible for >>>>>> taking the sampling >>>>>> path? (Going from the member description of >>>>>> ThreadLocalAllocBuffer). >>>>>> >>>>>> >>>>>> I've moved this code and it no longer shows up here but the >>>>>> rationale and answer was: >>>>>> >>>>>> So.. Yes, end is the variable provoking the sampling. 
Actual >>>>>> end is the actual end of the TLAB. >>>>>> >>>>>> What was happening here is that the code is resetting _end to >>>>>> point towards the end of the new TLAB. Because, we now have the end for >>>>>> sampling and _actual_end for >>>>>> the actual end, we need to update the actual_end as well. >>>>>> >>>>>> Normally, were we to do the real work here, we would >>>>>> calculate the (end - start) offset, then do: >>>>>> >>>>>> - Set the new end to : start + (old_end - old_start) >>>>>> - Set the actual end like we do here now where it because it >>>>>> is the actual end. >>>>>> >>>>>> Why is this not done here now anymore? >>>>>> - I was still debating which path to take: >>>>>> - Do it in the fast refill code, it has its perks: >>>>>> - In a world where fast refills are happening all >>>>>> the time or a lot, we can augment there the code to do the sampling >>>>>> - Remember what we had as an end before leaving the >>>>>> slowpath and check on return >>>>>> - This is what I'm doing now, it removes the need >>>>>> to go fix up all fast refill paths but if you remain in fast refill paths, >>>>>> you won't get sampling. I >>>>>> have to think of the consequences of that, maybe a future >>>>>> change later on? >>>>>> - I have the statistics now so I'm going to >>>>>> study that >>>>>> -> By the way, though my statistics are >>>>>> showing I'm missing some samples, if I turn off FastTlabRefill, it is the >>>>>> same loss so for now, it seems >>>>>> this does not occur in my simple test. >>>>>> >>>>>> >>>>>> >>>>>> But maybe I am only confused and it's best to just leave >>>>>> the comment >>>>>> away. :) >>>>>> >>>>>> Thinking about it some more, doesn't this not-sampling >>>>>> in this case >>>>>> mean that sampling does not work in any collector that >>>>>> does inline TLAB >>>>>> allocation at the moment? (Or is inline TLAB alloc >>>>>> automatically >>>>>> disabled with sampling somehow?) 
>>>>>> >>>>>> That would indeed be a bigger TODO then :) >>>>>> >>>>>> >>>>>> Agreed, this remark made me think that perhaps as a first >>>>>> step the new way of doing it is better but I did have to: >>>>>> - Remove the const of the ThreadLocalBuffer remaining and >>>>>> hard_end methods >>>>>> - Move hard_end out of the header file to have a bit more >>>>>> logic there >>>>>> >>>>>> Please let me know what you think of that and if you prefer >>>>>> it this way or changing the fast refills. (I prefer this way now because it >>>>>> is more incremental). >>>>>> >>>>>> >>>>>> > > - calling HeapMonitoring::do_weak_oops() (which >>>>>> should probably be >>>>>> > > called weak_oops_do() like other similar methods) >>>>>> only if string >>>>>> > > deduplication is enabled (in >>>>>> g1CollectedHeap.cpp:4511) seems wrong. >>>>>> > >>>>>> > The call should be at least around 6 lines up outside >>>>>> the if. >>>>>> > >>>>>> > Preferentially in a method like >>>>>> process_weak_jni_handles(), including >>>>>> > additional logging. (No new (G1) gc phase without >>>>>> minimal logging >>>>>> > :)). >>>>>> > Done but really not sure because: >>>>>> > >>>>>> > I put for logging: >>>>>> > log_develop_trace(gc, freelist)("G1ConcRegionFreeing >>>>>> [other] : heap >>>>>> > monitoring"); >>>>>> >>>>>> I would think that "gc, ref" would be more appropriate >>>>>> log tags for >>>>>> this similar to jni handles. >>>>>> (I am als not sure what weak reference handling has to >>>>>> do with >>>>>> G1ConcRegionFreeing, so I am a bit puzzled) >>>>>> >>>>>> >>>>>> I was not sure what to put for the tags or really as the >>>>>> message. I cleaned it up a bit now to: >>>>>> log_develop_trace(gc, ref)("HeapSampling [other] : heap >>>>>> monitoring processing"); >>>>>> >>>>>> >>>>>> >>>>>> > Since weak_jni_handles didn't have logging for me to >>>>>> be inspired >>>>>> > from, I did that but unconvinced this is what should >>>>>> be done. 
>>>>>> >>>>>> The JNI handle processing does have logging, but only in >>>>>> ReferenceProcessor::process_discovered_references(). In >>>>>> process_weak_jni_handles() only overall time is measured >>>>>> (in a G1 >>>>>> specific way, since only G1 supports disabling reference >>>>>> procesing) :/ >>>>>> >>>>>> The code in ReferenceProcessor prints both time taken >>>>>> referenceProcessor.cpp:254, as well as the count, but >>>>>> strangely only in >>>>>> debug VMs. >>>>>> >>>>>> I have no idea why this logging is that unimportant to >>>>>> only print that >>>>>> in a debug VM. However there are reviews out for >>>>>> changing this area a >>>>>> bit, so it might be useful to wait for that >>>>>> (JDK-8173335). >>>>>> >>>>>> >>>>>> I cleaned it up a bit anyway and now it returns the count of >>>>>> objects that are in the system. >>>>>> >>>>>> >>>>>> > > - the change doubles the size of >>>>>> > > CollectedHeap::allocate_from_tlab_slow() above the >>>>>> "small and nice" >>>>>> > > threshold. Maybe it could be refactored a bit. >>>>>> > Done I think, it looks better to me :). >>>>>> >>>>>> In ThreadLocalAllocBuffer::handle_sample() I think the >>>>>> set_back_actual_end()/pick_next_sample() calls could be >>>>>> hoisted out of >>>>>> the "if" :) >>>>>> >>>>>> >>>>>> Done! >>>>>> >>>>>> >>>>>> > > - referenceProcessor.cpp:261: the change should add >>>>>> logging about >>>>>> > > the number of references encountered, maybe after >>>>>> the corresponding >>>>>> > > "JNI weak reference count" log message. >>>>>> > Just to double check, are you saying that you'd like >>>>>> to have the heap >>>>>> > sampler to keep in store how many sampled objects were >>>>>> encountered in >>>>>> > the HeapMonitoring::weak_oops_do? >>>>>> > - Would a return of the method with the number of >>>>>> handled >>>>>> > references and logging that work? >>>>>> >>>>>> Yes, it's fine if HeapMonitoring::weak_oops_do() only >>>>>> returned the >>>>>> number of processed weak oops. 
>>>>>> >>>>>> >>>>>> Done also (but I admit I have not tested the output yet) :) >>>>>> >>>>>> >>>>>> > - Additionally, would you prefer it in a separate >>>>>> block with its >>>>>> > GCTraceTime? >>>>>> >>>>>> Yes. Both kinds of information is interesting: while the >>>>>> time taken is >>>>>> typically more important, the next question would be >>>>>> why, and the >>>>>> number of references typically goes a long way there. >>>>>> >>>>>> See above though, it is probably best to wait a bit. >>>>>> >>>>>> >>>>>> Agreed that I "could" wait but, if it's ok, I'll just >>>>>> refactor/remove this when we get closer to something final. Either, >>>>>> JDK-8173335 >>>>>> has gone in and I will notice it now or it will soon and I >>>>>> can change it then. >>>>>> >>>>>> >>>>>> > > - threadLocalAllocBuffer.cpp:331: one more "TODO" >>>>>> > Removed it and added it to my personal todos to look >>>>>> at. >>>>>> > > > >>>>>> > > - threadLocalAllocBuffer.hpp: ThreadLocalAllocBuffer >>>>>> class >>>>>> > > documentation should be updated about the sampling >>>>>> additions. I >>>>>> > > would have no clue what the difference between >>>>>> "actual_end" and >>>>>> > > "end" would be from the given information. >>>>>> > If you are talking about the comments in this file, I >>>>>> made them more >>>>>> > clear I hope in the new webrev. If it was somewhere >>>>>> else, let me know >>>>>> > where to change. >>>>>> >>>>>> Thanks, that's much better. Maybe a note in the comment >>>>>> of the class >>>>>> that ThreadLocalBuffer provides some sampling facility >>>>>> by modifying the >>>>>> end() of the TLAB to cause "frequent" calls into the >>>>>> runtime call where >>>>>> actual sampling takes place. >>>>>> >>>>>> >>>>>> Done, I think it's better now. Added something about the >>>>>> slow_path_end as well. >>>>>> >>>>>> >>>>>> > > - in heapMonitoring.hpp: there are some random >>>>>> comments about some >>>>>> > > code that has been grabbed from >>>>>> "util/math/fastmath.[h|cc]". 
I >>>>>> > > can't tell whether this is code that can be used but >>>>>> I assume that >>>>>> > > Noam Shazeer is okay with that (i.e. that's all >>>>>> Google code). >>>>>> > Jeremy and I double checked and we can release that as >>>>>> I thought. I >>>>>> > removed the comment from that piece of code entirely. >>>>>> >>>>>> Thanks. >>>>>> >>>>>> > > - heapMonitoring.hpp/cpp static constant naming does >>>>>> not correspond >>>>>> > > to Hotspot's. Additionally, in Hotspot static >>>>>> methods are cased >>>>>> > > like other methods. >>>>>> > I think I fixed the methods to be cased the same way >>>>>> as all other >>>>>> > methods. For static constants, I was not sure. I fixed >>>>>> a few other >>>>>> > variables but I could not seem to really see a >>>>>> consistent trend for >>>>>> > constants. I made them as variables but I'm not sure >>>>>> now. >>>>>> >>>>>> Sorry again, style is a kind of mess. The goal of my >>>>>> suggestions here >>>>>> is only to prevent yet another style creeping in. >>>>>> >>>>>> > > - in heapMonitoring.cpp there are a few cryptic >>>>>> comments at the top >>>>>> > > that seem to refer to internal stuff that should >>>>>> probably be >>>>>> > > removed. >>>>>> > Sorry about that! My personal todos not cleared out. >>>>>> >>>>>> I am happy about comments, but I simply did not >>>>>> understand any of that >>>>>> and I do not know about other readers as well. >>>>>> >>>>>> If you think you will remember removing/updating them >>>>>> until the review >>>>>> proper (I misunderstood the review situation a little it >>>>>> seems). >>>>>> >>>>>> > > I did not think through the impact of the TLAB >>>>>> changes on collector >>>>>> > > behavior yet (if there are). Also I did not check >>>>>> for problems with >>>>>> > > concurrent mark and SATB/G1 (if there are). >>>>>> > I would love to know your thoughts on this, I think >>>>>> this is fine. I >>>>>> >>>>>> I think so too now. 
No objects are made live out of thin >>>>>> air :) >>>>>> >>>>>> > see issues with multiple threads right now hitting the >>>>>> stack storage >>>>>> > instance. Previous webrevs had a mutex lock here but >>>>>> we took it out >>>>>> > for simplificity (and only for now). >>>>>> >>>>>> :) When looking at this after some thinking I now assume >>>>>> for this >>>>>> review that this code is not MT safe at all. There seems >>>>>> to be more >>>>>> synchronization missing than just the one for the >>>>>> StackTraceStorage. So >>>>>> no comments about this here. >>>>>> >>>>>> >>>>>> I doubled checked a bit (quickly I admit) but it seems that >>>>>> synchronization in StackTraceStorage is really all you need (all methods >>>>>> lead to a StackTraceStorage one >>>>>> and can be multithreaded outside of that). >>>>>> There is a question about the initialization where the method >>>>>> HeapMonitoring::initialize_profiling is not thread safe. >>>>>> It would work (famous last words) and not crash if there was >>>>>> a race but we could add a synchronization point there as well (and >>>>>> therefore on the stop as well). >>>>>> >>>>>> But anyway I will really check and do this once we add back >>>>>> synchronization. >>>>>> >>>>>> >>>>>> Also, this would require some kind of specification of >>>>>> what is allowed >>>>>> to be called when and where. >>>>>> >>>>>> >>>>>> Would we specify this with the methods in the jvmti.xml file? >>>>>> We could start by specifying in each that they are not thread safe but I >>>>>> saw no mention of that for >>>>>> other methods. >>>>>> >>>>>> >>>>>> One potentially relevant observation about locking here: >>>>>> depending on >>>>>> sampling frequency, StackTraceStore::add_trace() may be >>>>>> rather >>>>>> frequently called. 
I assume that you are going to do >>>>>> measurements :) >>>>>> >>>>>> >>>>>> Though we don't have the TLAB implementation in our code, the >>>>>> compiler generated sampler uses 2% of overhead with a 512k sampling rate. I >>>>>> can do real measurements >>>>>> when the code settles and we can see how costly this is as a >>>>>> TLAB implementation. >>>>>> However, my theory is that if the rate is 512k, the >>>>>> memory/performance overhead should be minimal since it is what we saw with >>>>>> our code/workloads (though not called >>>>>> the same way, we call it essentially at the same rate). >>>>>> If you have a benchmark you'd like me to test, let me know! >>>>>> >>>>>> Right now, with my really small test, this does use a bit of >>>>>> overhead even for a 512k sample size. I don't know yet why, I'm going to >>>>>> see what is going on. >>>>>> >>>>>> Finally, I think it is not reasonable to suppose the overhead >>>>>> to be negligible if the sampling rate used is too low. The user should know >>>>>> that the lower the rate, >>>>>> the higher the overhead (documentation TODO?). >>>>>> >>>>>> >>>>>> I am not sure what the expected usage of the API is, but >>>>>> StackTraceStore::add_trace() seems to be able to grow >>>>>> without bounds. >>>>>> Only a GC truncates them to the live ones. That in >>>>>> itself seems to be >>>>>> problematic (GCs can be *wide* apart), and of course >>>>>> some of the API >>>>>> methods add to that because they duplicate that >>>>>> unbounded array. Do you >>>>>> have any concerns/measurements about this? >>>>>> >>>>>> >>>>>> So, the theory is that yes add_trace can be able to grow >>>>>> without bounds but it grows at a sample per 512k of allocated space. The >>>>>> stacks it gathers are currently >>>>>> maxed at 64 (I'd like to expand that to an option to the user >>>>>> though at some point). 
So I have no concerns because: >>>>>> >>>>>> - If really this is taking a lot of space, that means the job >>>>>> is keeping a lot of objects in memory as well, therefore the entire heap is >>>>>> getting huge >>>>>> - If this is the case, you will be triggering a GC at some >>>>>> point anyway. >>>>>> >>>>>> (I'm putting under the rug the issue of "What if we set the >>>>>> rate to 1 for example" because as you lower the sampling rate, we cannot >>>>>> guarantee low overhead; the >>>>>> idea behind this feature is to have a means of having >>>>>> meaningful allocated samples at a low overhead) >>>>>> >>>>>> I have no measurements really right now but since I now have >>>>>> some statistics I can poll, I will look a bit more at this question. >>>>>> >>>>>> I have the same last sentence than above: the user should >>>>>> expect this to happen if the sampling rate is too small. That probably can >>>>>> be reflected in the >>>>>> StartHeapSampling as a note : careful this might impact your >>>>>> performance. >>>>>> >>>>>> >>>>>> Also, these stack traces might hold on to huge arrays. >>>>>> Any >>>>>> consideration of that? Particularly it might be the >>>>>> cause for OOMEs in >>>>>> tight memory situations. >>>>>> >>>>>> >>>>>> There is a stack size maximum that is set to 64 so it should >>>>>> not hold huge arrays. I don't think this is an issue but I can double check >>>>>> with a test or two. >>>>>> >>>>>> >>>>>> - please consider adding a safepoint check in >>>>>> HeapMonitoring::weak_oops_do to prevent accidental >>>>>> misuse. >>>>>> >>>>>> - in struct StackTraceStorage, the public fields may >>>>>> also need >>>>>> underscores. At least some files in the runtime >>>>>> directory have structs >>>>>> with underscored public members (and some don't). The >>>>>> runtime team >>>>>> should probably comment on that. >>>>>> >>>>>> >>>>>> Agreed I did not know. I looked around and a lot of structs >>>>>> did not have them it seemed so I left it as is. 
I will happily change it if >>>>>> someone prefers (I was not >>>>>> sure if you really preferred or not, your sentence seemed to >>>>>> be more a note of "this might need to change but I don't know if the >>>>>> runtime team enforces that", let >>>>>> me know if I read that wrongly). >>>>>> >>>>>> >>>>>> - In StackTraceStorage::weak_oops_do(), when examining >>>>>> the >>>>>> StackTraceData, maybe it is useful to consider having a >>>>>> non-NULL >>>>>> reference outside of the heap's reserved space an error. >>>>>> There should >>>>>> be no oop outside of the heap's reserved space ever. >>>>>> >>>>>> Unless you allow storing random values in >>>>>> StackTraceData::obj, which I >>>>>> would not encourage. >>>>>> >>>>>> >>>>>> I suppose you are talking about this part: >>>>>> if ((value != NULL && Universe::heap()->is_in_reserved(value)) >>>>>> && >>>>>> (is_alive == NULL || >>>>>> is_alive->do_object_b(value))) { >>>>>> >>>>>> What you are saying is that I could have something like: >>>>>> if (value != my_non_null_reference && >>>>>> (is_alive == NULL || >>>>>> is_alive->do_object_b(value))) { >>>>>> >>>>>> Is that what you meant? Is there really a reason to do so? >>>>>> When I look at the code, is_in_reserved seems like a O(1) method call. I'm >>>>>> not even sure we can have a >>>>>> NULL value to be honest. I might have to study that to see if >>>>>> this was not a paranoid test to begin with. >>>>>> >>>>>> The is_alive code has now morphed due to the comment below. >>>>>> >>>>>> >>>>>> >>>>>> - HeapMonitoring::weak_oops_do() does not seem to use the >>>>>> passed AbstractRefProcTaskExecutor. 
>>>>>> >>>>>> >>>>>> It did use it: >>>>>> size_t HeapMonitoring::weak_oops_do( >>>>>> AbstractRefProcTaskExecutor *task_executor, >>>>>> BoolObjectClosure* is_alive, >>>>>> OopClosure *f, >>>>>> VoidClosure *complete_gc) { >>>>>> assert(SafepointSynchronize::is_at_safepoint(), "must be >>>>>> at safepoint"); >>>>>> >>>>>> if (task_executor != NULL) { >>>>>> task_executor->set_single_threaded_mode(); >>>>>> } >>>>>> return StackTraceStorage::storage()->weak_oops_do(is_alive, >>>>>> f, complete_gc); >>>>>> } >>>>>> >>>>>> But due to the comment below, I refactored this, so this is >>>>>> no longer here. Now I have an always true closure that is passed. >>>>>> >>>>>> >>>>>> - I do not understand allowing to call this method with >>>>>> a NULL >>>>>> complete_gc closure. This would mean that objects >>>>>> referenced from the >>>>>> object that is referenced by the StackTraceData are not >>>>>> pulled, meaning >>>>>> they would get stale. >>>>>> >>>>>> - same with is_alive parameter value of NULL >>>>>> >>>>>> >>>>>> So these questions made me look a bit closer at this code. >>>>>> This code I think was written this way to have a very small impact on the >>>>>> file but you are right, there >>>>>> is no reason for this here. I've simplified the code by >>>>>> making in referenceProcessor.cpp a process_HeapSampling method that handles >>>>>> everything there. >>>>>> >>>>>> The code allowed NULLs because it depended on where you were >>>>>> coming from and how the code was being called. >>>>>> >>>>>> - I added a static always_true variable and pass that now to >>>>>> be more consistent with the rest of the code. >>>>>> - I moved the complete_gc into process_phaseHeapSampling now >>>>>> (new method) and handle the task_executor and the complete_gc there >>>>>> - Newbie question: in our code we did a >>>>>> set_single_threaded_mode but I see that process_phaseJNI does it right >>>>>> before its call, do I need to do it for the >>>>>> process_phaseHeapSample? 
>>>>>> That API is much cleaner (in my mind) and is consistent with >>>>>> what is done around it (again in my mind). >>>>>> >>>>>> >>>>>> - heapMonitoring.cpp:590: I do not completely understand >>>>>> the purpose of >>>>>> this code: in the end this results in a fixed value >>>>>> directly dependent >>>>>> on the Thread address anyway? In the end this results in >>>>>> a fixed value >>>>>> directly dependent on the Thread address anyway? >>>>>> IOW, what is special about exactly 20 rounds? >>>>>> >>>>>> >>>>>> So we really want a fast random number generator that has a >>>>>> specific mean (512k is the default we use). The code uses the thread >>>>>> address as the start number of the >>>>>> sequence (why not, it is random enough is rationale). Then >>>>>> instead of just starting there, we prime the sequence and really only start >>>>>> at the 21st number, it is >>>>>> arbitrary and I have not done a study to see if we could do >>>>>> more or less of that. >>>>>> >>>>>> As I have the statistics of the system up and running, I'll >>>>>> run some experiments to see if this is needed, is 20 good, or not. >>>>>> >>>>>> >>>>>> - also I would consider stripping a few bits of the >>>>>> threads' address as >>>>>> initialization value for your rng. The last three bits >>>>>> (and probably >>>>>> more, check whether the Thread object is allocated on >>>>>> special >>>>>> boundaries) are always zero for them. >>>>>> Not sure if the given "random" value is random enough >>>>>> before/after, >>>>>> this method, so just skip that comment if you think this >>>>>> is not >>>>>> required. >>>>>> >>>>>> >>>>>> I don't know is the honest answer. I think what is important >>>>>> is that we tend towards a mean and it is random "enough" to not fall in >>>>>> pitfalls of only sampling a >>>>>> subset of objects due to their allocation order. 
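A rough sketch of the seeding and priming scheme described above: seed from the thread address, advance the generator 20 times, then draw sampling intervals around a 512k mean. The thread does not name the generator or the distribution, so the 48-bit linear congruential step (drand48 constants), the exponential interval, and the names next_random, seed_and_prime and pick_next_sample are all assumptions. The three-bit shift follows the suggestion to strip the always-zero low bits of the address.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// One step of a 48-bit LCG using the drand48 constants (assumed generator).
inline std::uint64_t next_random(std::uint64_t rnd) {
  const std::uint64_t mult = 0x5DEECE66DULL;
  const std::uint64_t add = 0xB;
  const std::uint64_t mask = (std::uint64_t(1) << 48) - 1;
  return (mult * rnd + add) & mask;
}

// Seeds from an address-like value with the low alignment bits dropped,
// then primes the sequence by `rounds` steps (20 in the discussion).
inline std::uint64_t seed_and_prime(std::uintptr_t thread_addr, int rounds = 20) {
  std::uint64_t rnd = static_cast<std::uint64_t>(thread_addr) >> 3;
  for (int i = 0; i < rounds; ++i) {
    rnd = next_random(rnd);
  }
  return rnd;
}

// Advances the state and returns the next interval in bytes, drawn from an
// exponential distribution with the given mean (assumed distribution).
inline std::uint64_t pick_next_sample(std::uint64_t& rnd,
                                      double mean_bytes = 512.0 * 1024.0) {
  rnd = next_random(rnd);
  // Top 26 bits of the 48-bit state, mapped to a uniform value in (0, 1].
  const double u = (static_cast<double>(rnd >> 22) + 1.0) / 67108864.0;
  const double interval = -std::log(u) * mean_bytes;
  return static_cast<std::uint64_t>(interval) + 1;  // never return 0
}
```

Averaging the intervals over many draws, with and without the priming or the shift, is one way to check whether either choice moves the observed mean.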
I added that >>>>>> as test to do to see if it changes the mean in any way for the 512k default >>>>>> value and/or if the first >>>>>> 1000 elements look better. >>>>>> >>>>>> >>>>>> Some more random nits I did not find a place to put >>>>>> anywhere: >>>>>> >>>>>> - ThreadLocalAllocBuffer::_extra_space does not seem to >>>>>> be used >>>>>> anywhere? >>>>>> >>>>>> >>>>>> Good catch :). >>>>>> >>>>>> >>>>>> - Maybe indent the declaration of >>>>>> ThreadLocalAllocBuffer::_bytes_until_sample to align below the other >>>>>> members of that group. >>>>>> >>>>>> >>>>>> Done moved it up a bit to have non static members together >>>>>> and static separate. >>>>>> >>>>>> Thanks, >>>>>> Thomas >>>>>> >>>>>> >>>>>> Thanks for your review! >>>>>> Jc >>>>>> >>>>>> >>>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yasuenag at gmail.com Tue Oct 3 04:18:00 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 3 Oct 2017 13:18:00 +0900 Subject: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> Message-ID: Hi all, I added gtest unit test case for this change in new webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ Could you review it? Thanks, Yasumasa 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga : > Hi all, > > I uploaded new webrev to be adapted to jdk10/hs: > > http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ > > > Thanks, > > Yasumasa > > > On 2017/09/21 7:45, Yasumasa Suenaga wrote: >> >> PING: >> >> Have you checked this issue? >> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >> >> >> >> Yasumasa >> >> >> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>> >>> PING: >>> >>> Have you checked this issue? >>> >>> >>> Yasumasa >>> >>> >>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>> >>>> Hi all, >>>> >>>> I want to discuss about JDK-8151815: Could not parse core image with >>>> JSnap. 
>>>> >>>> >>>> In last year, I found JSnap cannot parse coredump and I've sent review >>>> request for it as JDK-8151815. However it has not been reviewed yet >>>> [1]. >>>> >>>> We've discussed about safety implementation, but we could not get >>>> consensus. >>>> IMHO all SA tools should be handled java processes and core images, >>>> and PerfCounter value is useful. So I fix this issue. >>>> >>>> I uploaded new webrev for this issue. I think this patch is safety >>>> because new flag PerfMemory::_destroyed guards double free, and all >>>> members in PerfMemory is accessible (they are not munmap'ed) >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>> >>>> >>>> Can you cooperate? >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> [1] >>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>> > From harsha.wardhana.b at oracle.com Tue Oct 3 19:47:12 2017 From: harsha.wardhana.b at oracle.com (Harsha Wardhana B) Date: Wed, 4 Oct 2017 01:17:12 +0530 Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent In-Reply-To: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> Message-ID: <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> Hi Roger, Thanks for the detailed review. Below is the webrev addressing all the review comments. http://cr.openjdk.java.net/~hb/5016517/webrev.01/ -Harsha On Tuesday 25 April 2017 10:56 PM, Roger Riggs wrote: > Hi Harsha, > > Thanks for this important improvement. Comments: > > > * jmxremote.password.template: > ? "Passwords will be hashed by server if they are in clear." Perhaps > should be more explicit: > > ?? "The jmxremote.passwords file will be re-written by the server to > replace all plain text passwords with hashed passwords when the file > is read by the server." > > line 35: "Base64 encoded hash"? 
-> drop the "Base64" in this line > isn't needed and > make it seems like it should appear as 1 field instead of 2 or 3. > > 37+: The syntax of the file may be clearer if it includes the complete > syntax in (line 39) not > just the password/hash fragment. > > Line 41:? "W = spaces"; above "tabs" are allowed as a delimiter; it > would be good to be consistent > and include the usualy white-space characters in the set, be as > specific as possible. > Is this the same set of whitespace used by Regex '\\s'. Only spaces and tabs are allowed. '\s' matches newline as well hence not allowed. > > 45: "java platform""? ->?? "MD5, SDA-1, SHA-256 are supported > algorithms." > > 49: be more specific about 'hashing is requested'? how?? Refer to the > management.properties > ? com.sun.management.jmxremote.password.hash value. > > > > 51:? "replace hashed" -> "replace *the *hashed" > 52: "with clear text or new" -> "with the clear text or the new" > 52: "If new password" -> "If the new password" > 53: "when new login" -> "when a new login" > > 60: "User generated" -> "A User generated" > > 67: Will the file be ignored if it has the wrong permissions. (With a > logged message) Addressed all the above review comments. > > * management.properties > > 306: "(Case for true/false ignored)"? - what does this mean; I think > it can be removed. > > 307: missing period at the end of the sentence. > 309: "in password file" -> "in the password file" > Done. > > * FileLoginModule.java > > 102: can this match better the similar name in the > management.properties if it has the same function: > ??? com.sun.management.jmxremote.password.hash Are you suggesting that 'hashPassword' be renamed to something similar to com.sun.management.jmxremote.password.hash? Variable names cannot be similar to property names since property names are long and provide complete context which local variables need not have to do. 
> 103: "replaces clear text passwords" -> "replaces each clear text
> password"
> 104: indent to match previous entries.
>
> * JMXPluggableAuthenticator.java
>
> 119: There is no need to copy the password to a new local
It is required since variables accessed from an inner class must be final
or effectively final.
>
> 128: add a space after ","
>
> 256 private static final String HASH_PASSWORDS =
> 257 "jmx.remote.x.password.file.hash";
>
> The name ".hash" part does not clearly communicate that passwords are
> to be hashed.
> "hashPasswords" might be more self explanatory.
Changed it to "jmx.remote.x.password.file.hashpassword".
> Also, can this be NOT duplicated here and in ConnectorBootStrap.java?
The property names used in ConnectorBootStrap follow the convention used
in the management.properties file - 'com.sun.management.*'. For
environment variables for a JMXConnector, the "jmx.remote.x.*" convention
is used. Hence they cannot be duplicated.
>
>
> * ConnectorBootStrap.java:
> 482: Add space after ","s; no spaces before.
>
> 770: use the same name for the option/property if possible to avoid
> confusion.
Not possible as explained above.
>
> 770: if the HASH_PASSWORDS static is appropriate use it instead of
> literal "true".
DefaultValues.HASH_PASSWORDS static is set to 'true' and can be used.
However, using the literal "true" is more readable than using the static.
>
> * HashedPasswordManager
>
> 80-83: The fields can be final and use the constructor to initialize
> in all cases and make the class final
> to avoid unintentional subclassing.
>
>
> 113: canWriteToFile: It should be made clear in the template that
> *both* the Security policy
> and the file access value are used to check that the file can be
> updated.
Made it explicit in the template as well as in code comments.
>
> 200: loadPasswords() - should this confirm the access to the file is
> allowed and it has
> the correct file access before reading?
Not really required. Appropriate exceptions are thrown if the file cannot
be accessed.
>
> Is the re-writing of the passwords intended to be done by a
> 'privileged' system.
> Does this need doPrivileged? I am not sure. Maybe it will be covered in the security review. > > * HashedPasswordFileTest: > > 88: should use the TestLibrary Utils.getRandomInstance so it logs the > seed and can be replayed if necessary. > > Done > Thanks, Roger > Thanks Harsha > > On 4/23/2017 6:20 AM, Harsha Wardhana B wrote: >> >> Hi All, >> >> Please review this enhancement to replace plain-text password for JMX >> agent with SHA-256 hash. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-5016517 >> >> >> webrev: http://cr.openjdk.java.net/~hb/5016517/webrev.00/ >> >> Overview of implementation: >> >> Currently, the JMX agent password file used to authenticate user, >> stores user name and password as clear text. Though system level >> restrictions are recommended for jmx password file, passwords are >> vulnerable since they are stored in clear. The current RFE proposes >> to store passwords as SHA256 hash instead of clear text. >> >> In current implementation, if password file is writable, and if >> passwords are in clear, they will be replaced by SHA256 hash upon >> agent boot-up or when login attempt is made. >> >> The file, >> src/jdk.management.agent/share/conf/jmxremote.password.template >> contains more details about the implementation. >> >> - Harsha >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Roger.Riggs at Oracle.com Tue Oct 3 19:24:57 2017 From: Roger.Riggs at Oracle.com (Roger Riggs) Date: Tue, 3 Oct 2017 15:24:57 -0400 Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent In-Reply-To: <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> Message-ID: <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> Hi Harsha, FileLoginModule.java:? 104:? Add a period at the end of the the sentence. JMXPluggableAuthenticator.java: line 306:? 
Is the difference between singular and plural significant? ? It would be less confusing if both were plural (hashPasswords). ConnectorBootstrap: 134: ...password.file.hash" and HashedPasswordManager disagree on the exact string. I would propose 'hashpasswords' as the suffix in all places to be consistent in ConnectorBootstrap.java, HashedPasswordManager (except for capitalization), jmxremote.password.template, and management.properties As is you have a mix of "...password.hash", "...password.file.hash", "...hashpassword"; that's not good for knowing there is only one semantic. line 482:? " ," -> ", "? space after comma, not before line: 771: is it intentional to discard the reference to the new HashedPasswordManager? If the intention is only to use the side effect of loadPasswords, then please create a static method in HashedPasswordManager for that purpose. (Even if just does the same code; it would be clear that's the purpose). (It probably also implies that the password file will be read a second time somewhere else in the initialization). line:770:? the string constant would be nicer as a final static string somewhere. ? "jmx.remote.x.password.file.hashpassword" Roger On 10/3/2017 3:47 PM, Harsha Wardhana B wrote: > > Hi Roger, > > Thanks for the detailed review. Below is the webrev addressing all the > review comments. > > http://cr.openjdk.java.net/~hb/5016517/webrev.01/ > > -Harsha > > > On Tuesday 25 April 2017 10:56 PM, Roger Riggs wrote: >> Hi Harsha, >> >> Thanks for this important improvement. Comments: >> >> >> * jmxremote.password.template: >> ? "Passwords will be hashed by server if they are in clear." Perhaps >> should be more explicit: >> >> ?? "The jmxremote.passwords file will be re-written by the server to >> replace all plain text passwords with hashed passwords when the file >> is read by the server." >> >> line 35: "Base64 encoded hash"? 
-> drop the "Base64" in this line >> isn't needed and >> make it seems like it should appear as 1 field instead of 2 or 3. >> >> 37+: The syntax of the file may be clearer if it includes the >> complete syntax in (line 39) not >> just the password/hash fragment. >> >> Line 41:? "W = spaces"; above "tabs" are allowed as a delimiter; it >> would be good to be consistent >> and include the usualy white-space characters in the set, be as >> specific as possible. >> Is this the same set of whitespace used by Regex '\\s'. > Only spaces and tabs are allowed. '\s' matches newline as well hence > not allowed. >> >> 45: "java platform""? ->?? "MD5, SDA-1, SHA-256 are supported >> algorithms." >> >> 49: be more specific about 'hashing is requested'? how?? Refer to the >> management.properties >> ? com.sun.management.jmxremote.password.hash value. >> >> >> >> 51:? "replace hashed" -> "replace *the *hashed" >> 52: "with clear text or new" -> "with the clear text or the new" >> 52: "If new password" -> "If the new password" >> 53: "when new login" -> "when a new login" >> >> 60: "User generated" -> "A User generated" >> >> 67: Will the file be ignored if it has the wrong permissions. (With a >> logged message) > Addressed all the above review comments. >> >> * management.properties >> >> 306: "(Case for true/false ignored)"? - what does this mean; I think >> it can be removed. >> >> 307: missing period at the end of the sentence. >> 309: "in password file" -> "in the password file" >> > Done. >> >> * FileLoginModule.java >> >> 102: can this match better the similar name in the >> management.properties if it has the same function: >> ??? com.sun.management.jmxremote.password.hash > Are you suggesting that 'hashPassword' be renamed to something similar > to com.sun.management.jmxremote.password.hash? Variable names cannot > be similar to property names since property names are long and provide > complete context which local variables need not have to do. 
The suffix should be the same in all places since it is a single semantic.
>> 103: "replaces clear text passwords" -> "replaces each clear text
>> password"
>> 104: indent to match previous entries.
>>
>> * JMXPluggableAuthenticator.java
>>
>> 119: There is no need to copy the password to a new local
> It is required since variables accessed from inner class must be final
> or effectively final.
Right.
>>
>> 128: add a space after ","
>>
>> 256 private static final String HASH_PASSWORDS =
>> 257 "jmx.remote.x.password.file.hash";
>>
>> The name ".hash" part does not clearly communicate that passwords are
>> to be hashed.
>> "hashPasswords" might be more self explanatory.
> Changed it to "jmx.remote.x.password.file.hashpassword".
Drop the "file."
>> Also, can this be NOT duplicated here and in ConnectorBootStrap.java?
> The property names used in ConnectorBootStrap follows the convention
> used in management.properties file - 'com.sun.management.*'. For
> environment variables for a JMXConnector "jmx.remote.x.*" convention
> is used. Hence they cannot be duplicated.
The differing prefixes are fine as is; no change except to make the new
keys consistent.
>>
>>
>> * ConnectorBootStrap.java:
>> 482: Add space after ","s; no spaces before.
>>
>> 770: use the same name for the option/property if possible to avoid
>> confusion.
> Not possible as explained above.
>>
>> 770: if the HASH_PASSWORDS static is appropriate use it instead of
>> literal "true".
> DefaultValues.HASH_PASSWORDS static is set to 'true' and can be used.
> However using literal "true" is more readable than using the static.
>>
>> * HashedPasswordManager
>>
>> 80-83: The fields can be final and use the constructor to initialize
>> in all cases and make the class final
>> to avoid unintentional subclassing.
>>
>>
>> 113: canWriteToFile: It should be made clear in the template that
>> *both* the Security policy
>> and the file access value are used to check that the file can be
>> updated.
> Made it explicit in template as well as code comments.
>>
>> 200: loadPasswords() - should this confirm the access to the file is
>> allowed and it has
>> the correct file access before reading?
> Not really required. Appropriate exceptions are thrown if file cannot > be accessed. >> >> Is the re-writing of the passwords intended to be done by a >> 'priveleged' system. >> Does this need doPrivileged? > I am not sure. Maybe it will be covered in the security review. >> >> * HashedPasswordFileTest: >> >> 88: should use the TestLibrary Utils.getRandomInstance so it logs the >> seed and can be replayed if necessary. >> >> > Done >> Thanks, Roger >> > Thanks > Harsha >> >> On 4/23/2017 6:20 AM, Harsha Wardhana B wrote: >>> >>> Hi All, >>> >>> Please review this enhancement to replace plain-text password for >>> JMX agent with SHA-256 hash. >>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-5016517 >>> >>> >>> webrev: http://cr.openjdk.java.net/~hb/5016517/webrev.00/ >>> >>> Overview of implementation: >>> >>> Currently, the JMX agent password file used to authenticate user, >>> stores user name and password as clear text. Though system level >>> restrictions are recommended for jmx password file, passwords are >>> vulnerable since they are stored in clear. The current RFE proposes >>> to store passwords as SHA256 hash instead of clear text. >>> >>> In current implementation, if password file is writable, and if >>> passwords are in clear, they will be replaced by SHA256 hash upon >>> agent boot-up or when login attempt is made. >>> >>> The file, >>> src/jdk.management.agent/share/conf/jmxremote.password.template >>> contains more details about the implementation. 
>>> >>> - Harsha >>> >>> >>> >>> >> > From harsha.wardhana.b at oracle.com Wed Oct 4 08:23:34 2017 From: harsha.wardhana.b at oracle.com (Harsha Wardhana B) Date: Wed, 4 Oct 2017 13:53:34 +0530 Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent In-Reply-To: <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> Message-ID: <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> Hi Roger, Below is the webrev incorporating changes suggested by you. http://cr.openjdk.java.net/~hb/5016517/webrev.02/ -Harsha On Wednesday 04 October 2017 12:54 AM, Roger Riggs wrote: > Hi Harsha, > > FileLoginModule.java:? 104:? Add a period at the end of the the sentence. > > JMXPluggableAuthenticator.java: line 306:? Is the difference between > singular and plural significant? > ? It would be less confusing if both were plural (hashPasswords). Ok. > ConnectorBootstrap: > 134: ...password.file.hash" and HashedPasswordManager disagree on the > exact string. > I would propose 'hashpasswords' as the suffix in all places to be > consistent > in ConnectorBootstrap.java, HashedPasswordManager (except for > capitalization), > jmxremote.password.template, and management.properties Do you want to rename HashedPasswordManager class? > > As is you have a mix of "...password.hash", "...password.file.hash", > "...hashpassword"; > that's not good for knowing there is only one semantic. > > line 482:? " ," -> ", "? space after comma, not before > Will incorporate above comments. > line: 771: is it intentional to discard the reference to the new > HashedPasswordManager? > If the intention is only to use the side effect of loadPasswords, then > please > create a static method in HashedPasswordManager for that purpose. > (Even if just does the same code; it would be clear that's the purpose). 
> (It probably also implies that the password file will be read a second
> time somewhere else in the initialization).
Static methods just to hash passwords can be created, but the
HashedPasswordManager class would have to be refactored since almost all
of its methods use instance variables. I am not sure we want instance
methods and look-alike static methods side by side. Wouldn't that be more
confusing than the current implementation?
>
> line:770: the string constant would be nicer as a final static string
> somewhere.
>   "jmx.remote.x.password.file.hashpassword"
None of the "jmx.remote.x.*" keys have static strings. They are used 'as
is' all over the code to maintain isolation between the pluggable login
authenticator and JDK code.
>
> Roger
>
>
Harsha
>
> On 10/3/2017 3:47 PM, Harsha Wardhana B wrote:
>>
>> Hi Roger,
>>
>> Thanks for the detailed review. Below is the webrev addressing all
>> the review comments.
>>
>> http://cr.openjdk.java.net/~hb/5016517/webrev.01/
>>
>> -Harsha
>>
>>
>> On Tuesday 25 April 2017 10:56 PM, Roger Riggs wrote:
>>> Hi Harsha,
>>>
>>> Thanks for this important improvement. Comments:
>>>
>>>
>>> * jmxremote.password.template:
>>>   "Passwords will be hashed by server if they are in clear." Perhaps
>>> should be more explicit:
>>>
>>>    "The jmxremote.passwords file will be re-written by the server to
>>> replace all plain text passwords with hashed passwords when the file
>>> is read by the server."
>>>
>>> line 35: "Base64 encoded hash" -> drop the "Base64"; in this line it
>>> isn't needed and
>>> makes it seem like it should appear as 1 field instead of 2 or 3.
>>>
>>> 37+: The syntax of the file may be clearer if it includes the
>>> complete syntax in (line 39) not
>>> just the password/hash fragment.
>>>
>>> Line 41: "W = spaces"; above "tabs" are allowed as a delimiter; it
>>> would be good to be consistent
>>> and include the usual white-space characters in the set, be as
>>> specific as possible.
>>> Is this the same set of whitespace used by Regex '\\s'. >> Only spaces and tabs are allowed. '\s' matches newline as well hence >> not allowed. >>> >>> 45: "java platform""? ->?? "MD5, SDA-1, SHA-256 are supported >>> algorithms." >>> >>> 49: be more specific about 'hashing is requested'? how?? Refer to >>> the management.properties >>> ? com.sun.management.jmxremote.password.hash value. >>> >>> >>> >>> 51:? "replace hashed" -> "replace *the *hashed" >>> 52: "with clear text or new" -> "with the clear text or the new" >>> 52: "If new password" -> "If the new password" >>> 53: "when new login" -> "when a new login" >>> >>> 60: "User generated" -> "A User generated" >>> >>> 67: Will the file be ignored if it has the wrong permissions. (With >>> a logged message) >> Addressed all the above review comments. >>> >>> * management.properties >>> >>> 306: "(Case for true/false ignored)"? - what does this mean; I think >>> it can be removed. >>> >>> 307: missing period at the end of the sentence. >>> 309: "in password file" -> "in the password file" >>> >> Done. >>> >>> * FileLoginModule.java >>> >>> 102: can this match better the similar name in the >>> management.properties if it has the same function: >>> ??? com.sun.management.jmxremote.password.hash >> Are you suggesting that 'hashPassword' be renamed to something >> similar to com.sun.management.jmxremote.password.hash? Variable names >> cannot be similar to property names since property names are long and >> provide complete context which local variables need not have to do. > the suffix should be the same in all places since it is a single > semantic. Done. >>> 103: "replaces clear text passwords" -> "replaces each clear text >>> password" >>> 104: indent to match previous
entries.
>>>
>>> * JMXPluggableAuthenticator.java
>>>
>>> 119: There is no need to copy the password to a new local
>> It is required since variables accessed from inner class must be
>> final or effectively final.
> right
>>>
>>> 128: add a space after ","
>>>
>>> 256 private static final String HASH_PASSWORDS =
>>> 257 "jmx.remote.x.password.file.hash";
>>>
>>> The name ".hash" part does not clearly communicate that passwords
>>> are to be hashed.
>>> "hashPasswords" might be more self explanatory.
>> Changed it to "jmx.remote.x.password.file.hashpassword".
> drop the "file."
Done.
>>> Also, can this be NOT duplicated here and in ConnectorBootStrap.java?
>> The property names used in ConnectorBootStrap follows the convention
>> used in management.properties file - 'com.sun.management.*'. For
>> environment variables for a JMXConnector "jmx.remote.x.*" convention
>> is used. Hence they cannot be duplicated.
> The differing prefixes are fine as is; no change except to make the
> new keys consistent.
>
>>>
>>>
>>> * ConnectorBootStrap.java:
>>> 482: Add space after ","s; no spaces before.
>>>
>>> 770: use the same name for the option/property if possible to avoid
>>> confusion.
>> Not possible as explained above.
>>>
>>> 770: if the HASH_PASSWORDS static is appropriate use it instead of
>>> literal "true".
>> DefaultValues.HASH_PASSWORDS static is set to 'true' and can be used.
>> However using literal "true" is more readable than using the static.
>>>
>>> * HashedPasswordManager
>>>
>>> 80-83: The fields can be final and use the constructor to initialize
>>> in all cases and make the class final
>>> to avoid unintentional subclassing.
>>>
>>>
>>> 113: canWriteToFile: It should be made clear in the template that
>>> *both* the Security policy
>>>    and the file access value are used to check that the file can be
>>> updated.
>> Made it explicit in template as well as code comments.
>>> >>> 200: loadPasswords() - should this confirm the access to the file is >>> allowed and it has >>> the correct file access before reading? >> Not really required. Appropriate exceptions are thrown if file cannot >> be accessed. >>> >>> Is the re-writing of the passwords intended to be done by a >>> 'priveleged' system. >>> Does this need doPrivileged? >> I am not sure. Maybe it will be covered in the security review. >>> >>> * HashedPasswordFileTest: >>> >>> 88: should use the TestLibrary Utils.getRandomInstance so it logs >>> the seed and can be replayed if necessary. >>> >>> >> Done >>> Thanks, Roger >>> >> Thanks >> Harsha >>> >>> On 4/23/2017 6:20 AM, Harsha Wardhana B wrote: >>>> >>>> Hi All, >>>> >>>> Please review this enhancement to replace plain-text password for >>>> JMX agent with SHA-256 hash. >>>> >>>> Issue: https://bugs.openjdk.java.net/browse/JDK-5016517 >>>> >>>> >>>> webrev: http://cr.openjdk.java.net/~hb/5016517/webrev.00/ >>>> >>>> Overview of implementation: >>>> >>>> Currently, the JMX agent password file used to authenticate user, >>>> stores user name and password as clear text. Though system level >>>> restrictions are recommended for jmx password file, passwords are >>>> vulnerable since they are stored in clear. The current RFE proposes >>>> to store passwords as SHA256 hash instead of clear text. >>>> >>>> In current implementation, if password file is writable, and if >>>> passwords are in clear, they will be replaced by SHA256 hash upon >>>> agent boot-up or when login attempt is made. >>>> >>>> The file, >>>> src/jdk.management.agent/share/conf/jmxremote.password.template >>>> contains more details about the implementation. 
>>>>
>>>> - Harsha
>>>>
>>>
>>
>
From jini.george at oracle.com Thu Oct 5 09:36:48 2017
From: jini.george at oracle.com (Jini George)
Date: Thu, 5 Oct 2017 15:06:48 +0530
Subject: RFR: 8187402: UnknownOopException is occurred on Stack Memory window in HSDB
In-Reply-To:
References: <4c8a242f-464e-297e-5779-a237691be495@gmail.com> <6a49c097-10dc-6a47-154e-a6f5ae2e96ec@oracle.com> <94dfc272-cf4a-b499-8616-3e55d1e7a2ea@oracle.com>
Message-ID: <2631c09e-fc9a-436f-41c6-7c4f1c5e6eb8@oracle.com>

Hi Yasumasa,

I think you could create an extension of LingeredApp with the "-Xcomp"
option or the "-XX:CompileCommand=compileonly," option to ensure JIT-ing
of a method, and maybe turn off TieredCompilation using
"-XX:-TieredCompilation" for ease, too. If that proves to be too
cumbersome, the JShell test case can still be used as a fallback option.

I agree that the JShell implementation may change in the future. But I
would say it is better to have the test running for as long as it can,
rather than having no testing at all for that part.

Thank you,
Jini.

On 10/2/2017 5:03 AM, Yasumasa Suenaga wrote:
> Hi Jini,
>
> Thank you for sharing the testcase.
> However I concern about this as below:
>
> 1. This bug appears at JIT'ed frame. So it might not appear on test server.
> We can add -Xcomp or JSON for compiler control for this issue, but we
> cannot control compile level (TieredCompilation) AFAIK.
>
> 2. If JShell implementation is changed in the future, this testcase
> might not verify this bug.
>
>
> I will merge your testcase to the webrev if these concerns can be ignored.
>
>
> Thanks,
>
> Yasumasa
>
>
>
> 2017/10/02 0:51 "Jini George" >:
>
>     Apologize for the delay in responding to this, Yasumasa.
>
>     I tried my hand at creating a test case for this by attaching an SA
>     process to jshell and by invoking a method to traverse the frame
>     oopmaps for the 'output reader' thread -- Please do take a look at:
>
>     http://cr.openjdk.java.net/~jgeorge/sponsorships/8187402_ysuenaga/TestFrameOopMap.java
>
>
>     I think, in general, for the issues manifested through the GUI, we
>     can probably try having unit test cases directly invoking the
>     methods involved.
>
>     Thanks,
>     Jini.
>
>     On 9/27/2017 4:10 AM, Yasumasa Suenaga wrote:
>
>         Hi Jini,
>
>         IMHO this issue (JDK-8187402) and JDK-8187403 are too difficult
>         to create test cases because they are problems in Stack Memory
>         window in HSDB.
>         Can we add noreg-hard label to JBS?
>
>
>         Thanks,
>
>         Yasumasa
>
>
>         On 2017/09/27 2:36, Jini George wrote:
>
>             Hi Yasumasa,
>
>             The changes look fine, but please do include the test case
>             also for
>             this. In general, it would be great if you could provide
>             test cases also
>             along with the code changes while sending for review.
>
>             Thank you,
>             Jini.
>
>             On 9/26/2017 8:19 PM, Yasumasa Suenaga wrote:
>
>                 Hi all,
>
>                 I uploaded new webrev to be adapted to jdk10/hs:
>
>                 http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.01/
>
>
>
>                 Thanks,
>
>                 Yasumasa
>
>
>                 On 2017/09/21 7:47, Yasumasa Suenaga wrote:
>
>                     PING:
>
>                     Have you checked this issue?
>
>                         http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.00/
>
>
>
>
>                     Yasumasa
>
>
>                     On 2017/09/11 11:17, Yasumasa Suenaga wrote:
>
>                         Hi all,
>
>                         This review request is a part of [1].
>
>
>                         JBS:
>                         https://bugs.openjdk.java.net/browse/JDK-8187402
>
>
>                         webrev:
>                         http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.00/
>
>
>
>                         I cannot access JPRT. So I need a sponsor.
>
>
>                         Thanks,
>
>                         Yasumasa
>
>
>                         [1]
>                         http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-September/021821.html
>
>
From serguei.spitsyn at oracle.com Thu Oct 5 10:03:42 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Oct 2017 03:03:42 -0700
Subject: RFR: 8187402: UnknownOopException is occurred on Stack Memory window in HSDB
In-Reply-To: <2631c09e-fc9a-436f-41c6-7c4f1c5e6eb8@oracle.com>
References: <4c8a242f-464e-297e-5779-a237691be495@gmail.com> <6a49c097-10dc-6a47-154e-a6f5ae2e96ec@oracle.com> <94dfc272-cf4a-b499-8616-3e55d1e7a2ea@oracle.com> <2631c09e-fc9a-436f-41c6-7c4f1c5e6eb8@oracle.com>
Message-ID: <6f0df92e-a917-f105-8ef8-444883c1434b@oracle.com>

Hi Jini,

Sorry, I've already pushed the fix for 8187402.
I understood that you are OK with the fix itself.
Please let me know if you still have concerns.

David suggested skipping unit test development and adding the noreg-hard
label for 8187401 (please see the email thread for 8187401).
8187402 is similar to 8187401, so adding noreg-hard should be enough
here as well.
My understanding is that we have very few tests for HSDB at this point,
so the coverage is very limited anyway.

I've not pushed 8187401 and 8187403 yet because one more review is
required.

Thanks,
Serguei

On 10/5/17 02:36, Jini George wrote:
> Hi Yasumasa,
>
> I think you could create an extension of LingeredApp with the "-Xcomp"
> option or the "-XX:CompileCommand=compileonly," option to
> ensure JIT-ing of a method, and maybe turn off TieredCompilation using
> "-XX:-TieredCompilation" for ease, too. If that proves to be too
> cumbersome, the JShell test case can still be used as a fallback option.
>
> I agree that the JShell implementation may change in the future. But I
> would say it is better to have the test running for as long as it can,
> rather than having no testing at all for that part.
>
> Thank you,
> Jini.
>
> On 10/2/2017 5:03 AM, Yasumasa Suenaga wrote:
>> Hi Jini,
>>
>> Thank you for sharing the testcase.
>> However I concern about this as below:
>>
>> 1. This bug appears at JIT'ed frame. So it might not appear on test
>> server.
>> We can add -Xcomp or JSON for compiler control for this issue, but we
>> cannot control compile level (TieredCompilation) AFAIK.
>>
>> 2. If JShell implementation is changed in the future, this testcase
>> might not verify this bug.
>>
>>
>> I will merge your testcase to the webrev if these concerns can be
>> ignored.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>
>> 2017/10/02 0:51 "Jini George" > >:
>>
>>     Apologize for the delay in responding to this, Yasumasa.
>>
>>     I tried my hand at creating a test case for this by attaching an SA
>>     process to jshell and by invoking a method to traverse the frame
>>     oopmaps for the 'output reader' thread -- Please do take a look at:
>>
>>     http://cr.openjdk.java.net/~jgeorge/sponsorships/8187402_ysuenaga/TestFrameOopMap.java
>>
>>
>>     I think, in general, for the issues manifested through the GUI, we
>>     can probably try having unit test cases directly invoking the
>>     methods involved.
>>
>>     Thanks,
>>     Jini.
>>
>>     On 9/27/2017 4:10 AM, Yasumasa Suenaga wrote:
>>
>>         Hi Jini,
>>
>>         IMHO this issue (JDK-8187402) and JDK-8187403 are too difficult
>>         to create test cases because they are problems in Stack Memory
>>         window in HSDB.
>>         Can we add noreg-hard label to JBS?
>>
>>
>>         Thanks,
>>
>>         Yasumasa
>>
>>
>>         On 2017/09/27 2:36, Jini George wrote:
>>
>>             Hi Yasumasa,
>>
>>             The changes look fine, but please do include the test case
>>             also for
>>             this. In general, it would be great if you could provide
>>             test cases also
>>             along with the code changes while sending for review.
>>
>>             Thank you,
>>             Jini.
>>
>>             On 9/26/2017 8:19 PM, Yasumasa Suenaga wrote:
>>
>>                 Hi all,
>>
>>                 I uploaded new webrev to be adapted to jdk10/hs:
>>
>>                 http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.01/
>>
>>
>>
>>                 Thanks,
>>
>>                 Yasumasa
>>
>>
>>                 On 2017/09/21 7:47, Yasumasa Suenaga wrote:
>>
>>                     PING:
>>
>>                     Have you checked this issue?
>>
>>                         http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.00/
>>
>>
>>
>>
>>                     Yasumasa
>>
>>
>>                     On 2017/09/11 11:17, Yasumasa Suenaga wrote:
>>
>>                         Hi all,
>>
>>                         This review request is a part of [1].
>>
>>
>>                         JBS:
>>                         https://bugs.openjdk.java.net/browse/JDK-8187402
>>
>>
>>                         webrev:
>>                         http://cr.openjdk.java.net/~ysuenaga/JDK-8187402/webrev.00/
>>
>>
>>
>>                         I cannot access JPRT. So I need a sponsor.
>>
>>
>>                         Thanks,
>>
>>                         Yasumasa
>>
>>
>>                         [1]
>>                         http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-September/021821.html
>>
>>
From harsha.wardhana.b at oracle.com Fri Oct 6 05:25:56 2017
From: harsha.wardhana.b at oracle.com (Harsha Wardhana B)
Date: Fri, 6 Oct 2017 10:55:56 +0530
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To: <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com>
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com>
Message-ID: <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com>

Hi All,

Previously, for the default agent, the passwords were hashed during agent
boot-up (ConnectorBootstrap.java). That was an error: the login
configuration could be different and is determined only when a login
attempt is made, in which case hashing the password file at boot-up would
be pointless. The fix for this and some off-list comments are
incorporated in the webrev below.

http://cr.openjdk.java.net/~hb/5016517/webrev.03/

-Harsha

On Wednesday 04 October 2017 01:53 PM, Harsha Wardhana B wrote:
> Hi Roger,
>
> Below is the webrev incorporating changes suggested by you.
>
> http://cr.openjdk.java.net/~hb/5016517/webrev.02/
>
> -Harsha
>
> On Wednesday 04 October 2017 12:54 AM, Roger Riggs wrote:
>> Hi Harsha,
>>
>> FileLoginModule.java: 104: Add a period at the end of the
>> sentence.
>>
>> JMXPluggableAuthenticator.java: line 306: Is the difference between
>> singular and plural significant?
>>   It would be less confusing if both were plural (hashPasswords).
> Ok.
>> ConnectorBootstrap:
>> 134: "...password.file.hash" and HashedPasswordManager disagree on the
>> exact string.
>> I would propose 'hashpasswords' as the suffix in all places to be >> consistent >> in ConnectorBootstrap.java, HashedPasswordManager (except for >> capitalization), >> jmxremote.password.template, and management.properties > Do you want to rename HashedPasswordManager class? >> >> As is you have a mix of "...password.hash", "...password.file.hash", >> "...hashpassword"; >> that's not good for knowing there is only one semantic. >> >> line 482:? " ," -> ", "? space after comma, not before >> > Will incorporate above comments. >> line: 771: is it intentional to discard the reference to the new >> HashedPasswordManager? >> If the intention is only to use the side effect of loadPasswords, >> then please >> create a static method in HashedPasswordManager for that purpose. >> (Even if just does the same code; it would be clear that's the purpose). >> (It probably also implies that the password file will be read a >> second time somewhere else in the initialization). > Static methods just to hash passwords can be created but > HashedPasswordManager class will have to be re-factored since almost > all methods are using instance variables. Not sure if we want instance > methods and look-alike static methods side-by-side. Wouldn't that be > more confusing than current implementation? >> >> line:770:? the string constant would be nicer as a final static >> string somewhere. >> ? "jmx.remote.x.password.file.hashpassword" > All of "jmx.remote.x.*" don't have static strings. They are used 'as > is' all over the code to maintain isolation between pluggable login > authenticator and JDK code. >> >> Roger >> >> > Harsha >> >> On 10/3/2017 3:47 PM, Harsha Wardhana B wrote: >>> >>> Hi Roger, >>> >>> Thanks for the detailed review. Below is the webrev addressing all >>> the review comments. 
>>> >>> http://cr.openjdk.java.net/~hb/5016517/webrev.01/ >>> >>> -Harsha >>> >>> >>> On Tuesday 25 April 2017 10:56 PM, Roger Riggs wrote: >>>> Hi Harsha, >>>> >>>> Thanks for this important improvement. Comments: >>>> >>>> >>>> * jmxremote.password.template: >>>> ? "Passwords will be hashed by server if they are in clear." >>>> Perhaps should be more explicit: >>>> >>>> ?? "The jmxremote.passwords file will be re-written by the server >>>> to replace all plain text passwords with hashed passwords when the >>>> file is read by the server." >>>> >>>> line 35: "Base64 encoded hash"? -> drop the "Base64" in this line >>>> isn't needed and >>>> make it seems like it should appear as 1 field instead of 2 or 3. >>>> >>>> 37+: The syntax of the file may be clearer if it includes the >>>> complete syntax in (line 39) not >>>> just the password/hash fragment. >>>> >>>> Line 41:? "W = spaces"; above "tabs" are allowed as a delimiter; it >>>> would be good to be consistent >>>> and include the usualy white-space characters in the set, be as >>>> specific as possible. >>>> Is this the same set of whitespace used by Regex '\\s'. >>> Only spaces and tabs are allowed. '\s' matches newline as well hence >>> not allowed. >>>> >>>> 45: "java platform""? ->?? "MD5, SDA-1, SHA-256 are supported >>>> algorithms." >>>> >>>> 49: be more specific about 'hashing is requested'? how? Refer to >>>> the management.properties >>>> ? com.sun.management.jmxremote.password.hash value. >>>> >>>> >>>> >>>> 51:? "replace hashed" -> "replace *the *hashed" >>>> 52: "with clear text or new" -> "with the clear text or the new" >>>> 52: "If new password" -> "If the new password" >>>> 53: "when new login" -> "when a new login" >>>> >>>> 60: "User generated" -> "A User generated" >>>> >>>> 67: Will the file be ignored if it has the wrong permissions. (With >>>> a logged message) >>> Addressed all the above review comments. >>>> >>>> * management.properties >>>> >>>> 306: "(Case for true/false ignored)"? 
- what does this mean; I >>>> think it can be removed. >>>> >>>> 307: missing period at the end of the sentence. >>>> 309: "in password file" -> "in the password file" >>>> >>> Done. >>>> >>>> * FileLoginModule.java >>>> >>>> 102: can this match better the similar name in the >>>> management.properties if it has the same function: >>>> ??? com.sun.management.jmxremote.password.hash >>> Are you suggesting that 'hashPassword' be renamed to something >>> similar to com.sun.management.jmxremote.password.hash? Variable >>> names cannot be similar to property names since property names are >>> long and provide complete context which local variables need not >>> have to do. >> the suffix should be the same in all places since it is a single >> semantic. > Done. >>>> 103: "replaces clear text passwords" -> "replaces each clear text >>>> password" >>>> 104: indent to match previous
entries.
>>>>
>>>> * JMXPluggableAuthenticator.java
>>>>
>>>> 119: There is no need to copy the password to a new local
>>> It is required since variables accessed from inner class must be
>>> final or effectively final.
>> right
>>>>
>>>> 128: add a space after ","
>>>>
>>>> 256 private static final String HASH_PASSWORDS =
>>>> 257 "jmx.remote.x.password.file.hash";
>>>>
>>>> The name ".hash" part does not clearly communicate that passwords
>>>> are to be hashed.
>>>> "hashPasswords" might be more self explanatory.
>>> Changed it to "jmx.remote.x.password.file.hashpassword".
>> drop the "file."
> Done.
>>>> Also, can this be NOT duplicated here and in ConnectorBootStrap.java?
>>> The property names used in ConnectorBootStrap follows the convention
>>> used in management.properties file - 'com.sun.management.*'. For
>>> environment variables for a JMXConnector "jmx.remote.x.*" convention
>>> is used. Hence they cannot be duplicated.
>> The differing prefixes are fine as is; no change except to make the
>> new keys consistent.
>>
>>>>
>>>>
>>>> * ConnectorBootStrap.java:
>>>> 482: Add space after ","s; no spaces before.
>>>>
>>>> 770: use the same name for the option/property if possible to avoid
>>>> confusion.
>>> Not possible as explained above.
>>>>
>>>> 770: if the HASH_PASSWORDS static is appropriate use it instead of
>>>> literal "true".
>>> DefaultValues.HASH_PASSWORDS static is set to 'true' and can be
>>> used. However using literal "true" is more readable than using the
>>> static.
>>>>
>>>> * HashedPasswordManager
>>>>
>>>> 80-83: The fields can be final and use the constructor to
>>>> initialize in all cases and make the class final
>>>> to avoid unintentional subclassing.
>>>>
>>>>
>>>> 113: canWriteToFile: It should be made clear in the template that
>>>> *both* the Security policy
>>>>    and the file access value are used to check that the file can be
>>>> updated.
>>> Made it explicit in template as well as code comments.
>>>> >>>> 200: loadPasswords() - should this confirm the access to the file >>>> is allowed and it has >>>> the correct file access before reading? >>> Not really required. Appropriate exceptions are thrown if file >>> cannot be accessed. >>>> >>>> Is the re-writing of the passwords intended to be done by a >>>> 'priveleged' system. >>>> Does this need doPrivileged? >>> I am not sure. Maybe it will be covered in the security review. >>>> >>>> * HashedPasswordFileTest: >>>> >>>> 88: should use the TestLibrary Utils.getRandomInstance so it logs >>>> the seed and can be replayed if necessary. >>>> >>>> >>> Done >>>> Thanks, Roger >>>> >>> Thanks >>> Harsha >>>> >>>> On 4/23/2017 6:20 AM, Harsha Wardhana B wrote: >>>>> >>>>> Hi All, >>>>> >>>>> Please review this enhancement to replace plain-text password for >>>>> JMX agent with SHA-256 hash. >>>>> >>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-5016517 >>>>> >>>>> >>>>> webrev: http://cr.openjdk.java.net/~hb/5016517/webrev.00/ >>>>> >>>>> Overview of implementation: >>>>> >>>>> Currently, the JMX agent password file used to authenticate user, >>>>> stores user name and password as clear text. Though system level >>>>> restrictions are recommended for jmx password file, passwords are >>>>> vulnerable since they are stored in clear. The current RFE >>>>> proposes to store passwords as SHA256 hash instead of clear text. >>>>> >>>>> In current implementation, if password file is writable, and if >>>>> passwords are in clear, they will be replaced by SHA256 hash upon >>>>> agent boot-up or when login attempt is made. >>>>> >>>>> The file, >>>>> src/jdk.management.agent/share/conf/jmxremote.password.template >>>>> contains more details about the implementation. 
>>>>>
>>>>> - Harsha
>>>>>
>>>>
>>>
>>
>
From jini.george at oracle.com Fri Oct 6 05:31:10 2017
From: jini.george at oracle.com (Jini George)
Date: Fri, 6 Oct 2017 11:01:10 +0530
Subject: PING: RFR: 8187401: Java Stack cannot be shown on HSDB
In-Reply-To:
References: <3d25fa66-085d-11a7-74f4-eba9fa0360b5@oracle.com> <71273032-0ad5-8aed-350a-b53d0da625b3@oracle.com> <05acbfe2-5d8a-b492-8f30-6180568ad26d@oracle.com> <4457260a-a6e3-57a4-ef40-5b83db09f3b4@oracle.com>
Message-ID: <9d4c6c05-d927-1011-bf91-d35629d20800@oracle.com>

Hi Yasumasa,

Your changes look good. One point I want to make is that we have the
enum BasicTypeSize redefined in SA as public static final values, and
this makes it error prone when existing enum values change, just as in
this case. An ideal solution would be to include this in vmStructs.cpp
as a declare_constant() macro, and read this in SA with the
db.lookupIntConstant() method. This would insulate SA from enum value
changes in HotSpot to some extent -- but this is just a suggestion, and
you can choose to ignore this, since this would mean changing all the
following values in BasicType.

   public static final int tBoolean     = 4;
   public static final int tChar        = 5;
   public static final int tFloat       = 6;
   public static final int tDouble      = 7;
   public static final int tByte        = 8;
   public static final int tShort       = 9;
   public static final int tInt         = 10;
   public static final int tLong        = 11;
   public static final int tObject      = 12;
   public static final int tArray       = 13;
   public static final int tVoid        = 14;
   public static final int tAddress     = 15;
   public static final int tNarrowOop   = 16;
   public static final int tMetadata    = 17;
   public static final int tNarrowKlass = 18;
   public static final int tConflict    = 19;
   public static final int tIllegal     = 99;

Thank you,
Jini (Not a Reviewer).

On 9/29/2017 2:58 PM, Yasumasa Suenaga wrote:
> Hi all,
>
> This change has been reviewed by Serguei.
> I'm waiting for another reviewer.
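Jini's suggestion in the message above, reading the BasicType values from the VM instead of redefining them in SA, could look roughly like the Java sketch below. The ConstantDb interface is a hypothetical stand-in for the SA type database and only borrows the method name lookupIntConstant from the message; the exported constant names (T_BOOLEAN, T_INT, T_ILLEGAL) are assumptions, since the thread does not say how they would be declared in vmStructs.cpp.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only; none of these types are the real SA classes.
final class BasicTypeLookupSketch {

    // Hypothetical stand-in for the SA type database.
    interface ConstantDb {
        Integer lookupIntConstant(String name);
    }

    // Simulates a VM that exports a few constants (names are assumed).
    static ConstantDb fakeVm() {
        Map<String, Integer> exported = new HashMap<>();
        exported.put("T_BOOLEAN", 4);
        exported.put("T_INT", 10);
        exported.put("T_ILLEGAL", 99);
        return exported::get;
    }

    // Filled in from the database instead of being hard-coded.
    static int tBoolean;
    static int tInt;
    static int tIllegal;

    static void initialize(ConstantDb db) {
        tBoolean = require(db, "T_BOOLEAN");
        tInt = require(db, "T_INT");
        tIllegal = require(db, "T_ILLEGAL");
    }

    // Fails loudly when the VM does not export the constant.
    static int require(ConstantDb db, String name) {
        Integer value = db.lookupIntConstant(name);
        if (value == null) {
            throw new IllegalStateException("VM does not export " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        initialize(fakeVm());
        System.out.println("tBoolean=" + tBoolean + " tInt=" + tInt + " tIllegal=" + tIllegal);
    }
}
```

The point of the design is that a renumbered enum in the VM shows up in SA automatically, and a constant that is missing fails at initialization rather than silently decoding frames with stale values.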
> > > Thanks, > > Yasumasa > > > 2017/09/27 ??9:49 "Yasumasa Suenaga" >: > > Hi David, Serguei, > > I added noreg-hard label and how to reproduce to JBS: > > https://bugs.openjdk.java.net/browse/JDK-8187401 > > > > Also I uploaded new webrev for jdk10/hs: > > http://cr.openjdk.java.net/~ysuenaga/JDK-8187401/webrev.01/ > > > > Thanks, > > Yasumasa > > > > 2017-09-27 8:25 GMT+09:00 serguei.spitsyn at oracle.com > > >: > > On 9/26/17 16:22, David Holmes wrote: > >> > >> On 27/09/2017 8:52 AM, serguei.spitsyn at oracle.com > wrote: > >>> > >>> Hi David, > >>> > >>> > >>> On 9/26/17 15:09, David Holmes wrote: > >>>> > >>>> Hi Sergeui, > >>>> > >>>> On 27/09/2017 3:51 AM, serguei.spitsyn at oracle.com > wrote: > >>>>> > >>>>> Hi Yasumasa, > >>>>> > >>>>> > >>>>> On 9/26/17 02:41, Yasumasa Suenaga wrote: > >>>>>> > >>>>>> Hi Serguei, > >>>>>> > >>>>>> Thank you for your comment! > >>>>>> > >>>>>>> This fix looks Ok to me but you need to add a unit test. > >>>>>> > >>>>>>? ?I guess it is caused by inlined method which is generated > by JIT > >>>>>> compiler. I don't know how to reproduce it on jtreg test. > >>>>>> Do you have any idea for it? > >>>>> > >>>>> > >>>>> I'm not sure what exact problem you have with jtreg. > >>>>> You may want to try to use other jtreg tests as examples. > >>>> > >>>> > >>>> I see two problems: > >>>> > >>>> 1. hsdb is an interactive GUI tool > >>> > >>> > >>> There is already at least one jtreg hsdb test: > >>> open/test/hotspot/jtreg/serviceability/sa/JhsdbThreadInfoTest.java > >>> > >>> Not sure, if this example would help in this case though. > >>> > >>>> 2. The problem seems related to JIT inlining - so how do you > force that > >>>> in a test? > >>> > >>> > >>> Then I wonder how was it forced in the manual reproducer? > >>> The fact it is fixed has to be verified anyway. > >> > >> > >> Well the reproducer happens to hit the issue, so we can use it > to manually > >> verify. > >> > >>>> I would think this is a noreg-hard situation. 
As long as there > is a > >>>> manual reproducer that can be used to verify the fix - as per > the bug report > >>>> - that should be okay IMHO. > >>> > >>> > >>> I'm Ok with adding noreg-hard label if it is hard to develop. > >> > >> > >> Sounds good to me. The manual verification steps should be very > clearly > >> spelt out in the bug report so that even someone unfamiliar with > hsdb (like > >> me!) can follow them easily. > > > > > > Sounds good, thanks. > > > > Serguei > > > >> > >> Cheers, > >> David > >> > >>> Thanks, > >>> Serguei > >>> > >>>> Cheers, > >>>> David > >>>> > >>>>> Thanks, > >>>>> Serguei > >>>>> > >>>>>> Yasumasa > >>>>>> > >>>>>> > >>>>>> 2017-09-26 18:15 GMT+09:00 serguei.spitsyn at oracle.com > > >>>>>> >: > >>>>>>> > >>>>>>> Hi Yasumasa, > >>>>>>> > >>>>>>> This fix looks Ok to me but you need to add a unit test. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Serguei > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> On 9/20/17 15:47, Yasumasa Suenaga wrote: > >>>>>>>> > >>>>>>>> PING: > >>>>>>>> > >>>>>>>> Have you checked this issue? > >>>>>>>> > >>>>>>>>> > http://cr.openjdk.java.net/~ysuenaga/JDK-8187401/webrev.00/ > > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> Yasumasa > >>>>>>>> > >>>>>>>> > >>>>>>>> On 2017/09/11 11:16, Yasumasa Suenaga wrote: > >>>>>>>>> > >>>>>>>>> Hi all, > >>>>>>>>> > >>>>>>>>> This review request is a part of [1]. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> JBS: > >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8187401 > > >>>>>>>>> > >>>>>>>>> webrev: > >>>>>>>>> > http://cr.openjdk.java.net/~ysuenaga/JDK-8187401/webrev.00/ > > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I cannot access JPRT. So I need a sponsor. 
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>>
> >>>>>>>>> Yasumasa
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> [1]
> >>>>>>>>>
> >>>>>>>>>
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-September/021821.html
>
> >>>>>>>>>
> >>>>>
> >>>
> >
>
From daniel.fuchs at oracle.com Fri Oct 6 10:08:38 2017
From: daniel.fuchs at oracle.com (Daniel Fuchs)
Date: Fri, 6 Oct 2017 11:08:38 +0100
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To: <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com>
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com>
Message-ID: <85633749-9830-dcd5-97a5-411f684776bd@oracle.com>

Hi Harsha,

Good work!

> http://cr.openjdk.java.net/~hb/5016517/webrev.03/

long standing typo in management.properties at line 90:
measureRole => monitorRole

HashedPasswordManager.java: loadPasswords()

It seems this function will add the header to the file even if
it already contains the header. So every time a user/administrator
wants to change/add a password, the header will be inserted again.

HashedPasswordFileTest should probably have a test for this
scenario as well:
  generate password file with clear text password
  load it, then verify passwords have been hashed (properly)
  add some new user/name password to the same file
  load it again, verify all passwords are hashed
  (do this a number of times - to make sure it doesn't break
   the second or third time)
  and finally verify the header is only present once ;-)

I'm surprised no other tests had to be modified.
Is password hash disabled by default in the default agent?
If not then you should try (locally) running jtreg more than
once over the default agent tests.
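The repeated-load scenario described above (hash, add an entry, reload, check that the header appears once) could be exercised along these lines. This is a hedged Java sketch on an in-memory list of lines, not the HashedPasswordFileTest from the webrev: the HEADER text, the "hashed:" marker, and fakeHash are all hypothetical placeholders, and a real test would go through the actual password file and digest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the password-file rewrite; not the webrev code.
final class RewriteSketch {
    static final String HEADER = "# passwords below are stored hashed"; // assumed text
    static final String HASHED_MARK = "hashed:";                         // assumed marker

    // Placeholder for a real digest so the sketch stays short.
    static String fakeHash(String clear) {
        return HASHED_MARK + Integer.toHexString(clear.hashCode());
    }

    // Rewrites the file content: one header, every clear password hashed.
    static List<String> rewrite(List<String> lines) {
        List<String> out = new ArrayList<>();
        out.add(HEADER); // the header is written exactly once
        for (String line : lines) {
            if (line.equals(HEADER)) {
                continue; // drop a header left by an earlier rewrite
            }
            // Only spaces and tabs separate the user name from the password.
            String[] fields = line.trim().split("[ \t]+", 2);
            if (fields.length < 2 || fields[0].startsWith("#")) {
                out.add(line); // blank lines and comments pass through
                continue;
            }
            // Already-hashed entries are kept, clear ones are hashed.
            String password = fields[1].startsWith(HASHED_MARK)
                    ? fields[1] : fakeHash(fields[1]);
            out.add(fields[0] + " " + password);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> file = new ArrayList<>(Arrays.asList("monitorRole apple"));
        for (int round = 0; round < 3; round++) {
            file = rewrite(file);                     // agent loads the file
            file.add("user" + round + " pw" + round); // admin adds a clear entry
        }
        file = rewrite(file);
        long headers = file.stream().filter(HEADER::equals).count();
        boolean allHashed = file.stream()
                .filter(l -> !l.startsWith("#"))
                .allMatch(l -> l.split(" ", 2)[1].startsWith(HASHED_MARK));
        System.out.println("headers: " + headers + ", all hashed: " + allHashed);
    }
}
```

The property worth asserting is idempotence: rewriting an already rewritten file must leave it unchanged, which covers both the duplicate-header concern and the "second or third time" concern in one check.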
Just make sure running the same test twice doesn't make the legacy tests that use password files failing the second time when they discover that passwords have been hashed under their feet (the client part of the test might be reading the password file too to see which password it should send to the agent). Otherwise I think it looks good to me - provided all tests are passing! best regards, -- daniel On 06/10/2017 06:25, Harsha Wardhana B wrote: > Hi All, > > Previously, for default agent, hashing of the passwords was done during > the agent boot-up (ConnectorBootstrap.java). That was an error since > login configuration could be different and is determined only when a > login attempt is made. It would be then pointless to hash the password > file. The fix for above and some off-list comments are incorporated in > webrev below. > > http://cr.openjdk.java.net/~hb/5016517/webrev.03/ > > -Harsha > > > On Wednesday 04 October 2017 01:53 PM, Harsha Wardhana B wrote: >> Hi Roger, >> >> Below is the webrev incorporating changes suggested by you. >> >> http://cr.openjdk.java.net/~hb/5016517/webrev.02/ >> >> -Harsha >> >> On Wednesday 04 October 2017 12:54 AM, Roger Riggs wrote: >>> Hi Harsha, >>> >>> FileLoginModule.java:? 104:? Add a period at the end of the the >>> sentence. >>> >>> JMXPluggableAuthenticator.java: line 306:? Is the difference between >>> singular and plural significant? >>> ? It would be less confusing if both were plural (hashPasswords). >> Ok. >>> ConnectorBootstrap: >>> 134: ...password.file.hash" and HashedPasswordManager disagree on the >>> exact string. >>> I would propose 'hashpasswords' as the suffix in all places to be >>> consistent >>> in ConnectorBootstrap.java, HashedPasswordManager (except for >>> capitalization), >>> jmxremote.password.template, and management.properties >> Do you want to rename HashedPasswordManager class? 
>>> >>> As is you have a mix of "...password.hash", "...password.file.hash", >>> "...hashpassword"; >>> that's not good for knowing there is only one semantic. >>> >>> line 482:? " ," -> ", "? space after comma, not before >>> >> Will incorporate above comments. >>> line: 771: is it intentional to discard the reference to the new >>> HashedPasswordManager? >>> If the intention is only to use the side effect of loadPasswords, >>> then please >>> create a static method in HashedPasswordManager for that purpose. >>> (Even if just does the same code; it would be clear that's the purpose). >>> (It probably also implies that the password file will be read a >>> second time somewhere else in the initialization). >> Static methods just to hash passwords can be created but >> HashedPasswordManager class will have to be re-factored since almost >> all methods are using instance variables. Not sure if we want instance >> methods and look-alike static methods side-by-side. Wouldn't that be >> more confusing than current implementation? >>> >>> line:770:? the string constant would be nicer as a final static >>> string somewhere. >>> ? "jmx.remote.x.password.file.hashpassword" >> All of "jmx.remote.x.*" don't have static strings. They are used 'as >> is' all over the code to maintain isolation between pluggable login >> authenticator and JDK code. >>> >>> Roger >>> >>> >> Harsha >>> >>> On 10/3/2017 3:47 PM, Harsha Wardhana B wrote: >>>> >>>> Hi Roger,>>> Thanks for the detailed review. Below is the webrev >>>> addressing all the review comments. >>>> >>>> http://cr.openjdk.java.net/~hb/5016517/webrev.01/ >>>> >>>> -Harsha >>>> >>>> >>>> On Tuesday 25 April 2017 10:56 PM, Roger Riggs wrote: >>>>> Hi Harsha, >>>>> >>>>> Thanks for this important improvement. Comments: >>>>> >>>>> >>>>> * jmxremote.password.template: >>>>> ? "Passwords will be hashed by server if they are in clear." >>>>> Perhaps should be more explicit: >>>>> >>>>> ?? 
"The jmxremote.passwords file will be re-written by the server >>>>> to replace all plain text passwords with hashed passwords when the >>>>> file is read by the server." >>>>> >>>>> line 35: "Base64 encoded hash"? -> drop the "Base64" in this line >>>>> isn't needed and >>>>> make it seems like it should appear as 1 field instead of 2 or 3. >>>>> >>>>> 37+: The syntax of the file may be clearer if it includes the >>>>> complete syntax in (line 39) not >>>>> just the password/hash fragment. >>>>> >>>>> Line 41:? "W = spaces"; above "tabs" are allowed as a delimiter; it >>>>> would be good to be consistent >>>>> and include the usualy white-space characters in the set, be as >>>>> specific as possible. >>>>> Is this the same set of whitespace used by Regex '\\s'. >>>> Only spaces and tabs are allowed. '\s' matches newline as well hence >>>> not allowed. >>>>> >>>>> 45: "java platform""? ->?? "MD5, SDA-1, SHA-256 are supported >>>>> algorithms." >>>>> >>>>> 49: be more specific about 'hashing is requested'? how? Refer to >>>>> the management.properties >>>>> ? com.sun.management.jmxremote.password.hash value. >>>>> >>>>> >>>>> >>>>> 51:? "replace hashed" -> "replace *the *hashed" >>>>> 52: "with clear text or new" -> "with the clear text or the new" >>>>> 52: "If new password" -> "If the new password" >>>>> 53: "when new login" -> "when a new login" >>>>> >>>>> 60: "User generated" -> "A User generated" >>>>> >>>>> 67: Will the file be ignored if it has the wrong permissions. (With >>>>> a logged message) >>>> Addressed all the above review comments. >>>>> >>>>> * management.properties >>>>> >>>>> 306: "(Case for true/false ignored)"? - what does this mean; I >>>>> think it can be removed. >>>>> >>>>> 307: missing period at the end of the sentence. >>>>> 309: "in password file" -> "in the password file" >>>>> >>>> Done. 
>>>>> >>>>> * FileLoginModule.java >>>>> >>>>> 102: can this match better the similar name in the >>>>> management.properties if it has the same function: >>>>> ??? com.sun.management.jmxremote.password.hash >>>> Are you suggesting that 'hashPassword' be renamed to something >>>> similar to com.sun.management.jmxremote.password.hash? Variable >>>> names cannot be similar to property names since property names are >>>> long and provide complete context which local variables need not >>>> have to do. >>> the suffix should be the same in all places since it is a single >>> semantic. >> Done. >>>>> 103: "replaces clear text passwords" -> "replaces each clear text >>>>> password" >>>>> 104: indent to match previous
entries. >>>>> >>>>> * JMXPluggableAuthenticator.java >>>>> >>>>> 119: There is no need to copy the password to a new local >>>> It is required since variables accessed from inner class must be >>>> final or effectively final. >>> right >>>>> >>>>> 128: add a space after "," >>>>> >>>>> 256 private static final String HASH_PASSWORDS = >>>>> 257 "jmx.remote.x.password.file.hash"; >>>>> >>>>> The name ".hash" part does not clearly communicate that passwords >>>>> are to be hashed. >>>>> "hashPasswords" might be more self explanatory. >>>> Changed it to "jmx.remote.x.password.file.hashpassword". >>> drop the "file." >> Done. >>>>> Also, can this be NOT duplicated here and in ConnectorBootStrap.java? >>>> The property names used in ConnectorBootStrap follow the convention >>>> used in management.properties file - 'com.sun.management.*'. For >>>> environment variables for a JMXConnector "jmx.remote.x.*" convention >>>> is used. Hence they cannot be duplicated. >>> The differing prefixes are fine as is; no change except to make the >>> new keys consistent. >>> >>>>> >>>>> >>>>> * ConnectorBootStrap.java: >>>>> 482: Add space after ","s; no spaces before. >>>>> >>>>> 770: use the same name for the option/property if possible to avoid >>>>> confusion. >>>> Not possible as explained above. >>>>> >>>>> 770: if the HASH_PASSWORDS static is appropriate use it instead of >>>>> literal "true". >>>> DefaultValues.HASH_PASSWORDS static is set to 'true' and can be >>>> used. However using literal "true" is more readable than using the >>>> static. >>>>> >>>>> * HashedPasswordManager >>>>> >>>>> 80-83: The fields can be final and use the constructor to >>>>> initialize in all cases and make the class final >>>>> to avoid unintentional subclassing. >>>>> >>>>> >>>>> 113: canWriteToFile: It should be made clear in the template that >>>>> *both* the Security policy >>>>> and the file access value are used to check that the file can be >>>>> updated.
>>>> Made it explicit in template as well as code comments. >>>>> >>>>> 200: loadPasswords() - should this confirm the access to the file >>>>> is allowed and it has >>>>> the correct file access before reading? >>>> Not really required. Appropriate exceptions are thrown if file >>>> cannot be accessed. >>>>> >>>>> Is the re-writing of the passwords intended to be done by a >>>>> 'priveleged' system. >>>>> Does this need doPrivileged? >>>> I am not sure. Maybe it will be covered in the security review. >>>>> >>>>> * HashedPasswordFileTest: >>>>> >>>>> 88: should use the TestLibrary Utils.getRandomInstance so it logs >>>>> the seed and can be replayed if necessary. >>>>> >>>>> >>>> Done >>>>> Thanks, Roger >>>>> >>>> Thanks >>>> Harsha >>>>> >>>>> On 4/23/2017 6:20 AM, Harsha Wardhana B wrote: >>>>>> >>>>>> Hi All, >>>>>> >>>>>> Please review this enhancement to replace plain-text password for >>>>>> JMX agent with SHA-256 hash. >>>>>> >>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-5016517 >>>>>> >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~hb/5016517/webrev.00/ >>>>>> >>>>>> Overview of implementation: >>>>>> >>>>>> Currently, the JMX agent password file used to authenticate user, >>>>>> stores user name and password as clear text. Though system level >>>>>> restrictions are recommended for jmx password file, passwords are >>>>>> vulnerable since they are stored in clear. The current RFE >>>>>> proposes to store passwords as SHA256 hash instead of clear text. >>>>>> >>>>>> In current implementation, if password file is writable, and if >>>>>> passwords are in clear, they will be replaced by SHA256 hash upon >>>>>> agent boot-up or when login attempt is made. >>>>>> >>>>>> The file, >>>>>> src/jdk.management.agent/share/conf/jmxremote.password.template >>>>>> contains more details about the implementation. 
>>>>>> >>>>>> - Harsha >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> > From robin.westberg at oracle.com Fri Oct 6 10:22:04 2017 From: robin.westberg at oracle.com (Robin Westberg) Date: Fri, 6 Oct 2017 12:22:04 +0200 Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations Message-ID: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> Hi all, Please review this change to add event-based tracing events for biased lock revocations: Issue: https://bugs.openjdk.java.net/browse/JDK-8187042 Webrev (courtesy of Erik Gahlin): http://cr.openjdk.java.net/~egahlin/8187042/ Best regards, Robin From yasuenag at gmail.com Fri Oct 6 12:42:52 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Fri, 6 Oct 2017 21:42:52 +0900 Subject: PING: RFR: 8187401: Java Stack cannot be shown on HSDB In-Reply-To: <9d4c6c05-d927-1011-bf91-d35629d20800@oracle.com> References: <3d25fa66-085d-11a7-74f4-eba9fa0360b5@oracle.com> <71273032-0ad5-8aed-350a-b53d0da625b3@oracle.com> <05acbfe2-5d8a-b492-8f30-6180568ad26d@oracle.com> <4457260a-a6e3-57a4-ef40-5b83db09f3b4@oracle.com> <9d4c6c05-d927-1011-bf91-d35629d20800@oracle.com> Message-ID: <90bf8490-5f97-c5fb-cc22-4787452fe28e@gmail.com> Hi Jini, > Your changes look good. Thanks! > One point I want to make is that we have the > enum BasicTypeSize redefined in SA as public static final values, I will update BasicTypeSize in the new webrev. > An ideal solution would be to include this in vmStructs.cpp > as a declare_constant() macro, and read this in SA with the > db.lookupIntConstant() method. I agree with you, but I think it should be a separate issue (enhancement). I've proposed a change to BasicType in 8185796 [1]. I can file it in JBS now, but I want to create the patch after 8185796 and 8187401 (both are still waiting for reviewers).
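The lookup-based approach suggested in the quoted review can be sketched roughly as follows. This is a minimal illustration, not the SA code: `ConstantSource` is a hypothetical stand-in for the SA type database, the exported names `T_BOOLEAN`, `T_INT` and `T_ILLEGAL` are assumptions about what vmStructs.cpp would declare, and the values 4, 10 and 99 are the ones listed in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the SA type database; only the lookup used here.
interface ConstantSource {
    Integer lookupIntConstant(String name);
}

// Hypothetical holder that reads BasicType values instead of hard-coding them.
final class BasicTypeSketch {
    private final int tBoolean;
    private final int tInt;
    private final int tIllegal;

    BasicTypeSketch(ConstantSource db) {
        // The constant names are assumptions; the VM would have to export them.
        tBoolean = require(db, "T_BOOLEAN");
        tInt = require(db, "T_INT");
        tIllegal = require(db, "T_ILLEGAL");
    }

    // Fail loudly if the VM side does not export the constant.
    private static int require(ConstantSource db, String name) {
        Integer value = db.lookupIntConstant(name);
        if (value == null) {
            throw new IllegalStateException("constant not exported: " + name);
        }
        return value;
    }

    int tBoolean() { return tBoolean; }
    int tInt() { return tInt; }
    int tIllegal() { return tIllegal; }

    public static void main(String[] args) {
        // Simulated export table; a real one would come from the target VM.
        Map<String, Integer> exported = new HashMap<>();
        exported.put("T_BOOLEAN", 4);
        exported.put("T_INT", 10);
        exported.put("T_ILLEGAL", 99);
        BasicTypeSketch types = new BasicTypeSketch(exported::get);
        System.out.println(types.tBoolean() + " " + types.tInt() + " " + types.tIllegal());
    }
}
```

With this shape, a renumbered enum on the VM side changes only the exported table, and a missing export surfaces as an exception rather than a silently stale value.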
Thanks, Yasumasa [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-September/021926.html On 2017/10/06 14:31, Jini George wrote: > Hi Yasumasa, > > Your changes look good. One point I want to make is that we have the > enum BasicTypeSize redefined in SA as public static final values, and > this makes it error prone when existing enum values change, just as in > this case. An ideal solution would be to include this in vmStructs.cpp > as a declare_constant() macro, and read this in SA with the > db.lookupIntConstant() method. This would insulate SA from enum value > changes in hotspot to some extent -- but this is just a suggestion, and > you can chose to ignore this, since this would mean changing all the > following values in BasicType. > > public static final int tBoolean = 4; > public static final int tChar = 5; > public static final int tFloat = 6; > public static final int tDouble = 7; > public static final int tByte = 8; > public static final int tShort = 9; > public static final int tInt = 10; > public static final int tLong = 11; > public static final int tObject = 12; > public static final int tArray = 13; > public static final int tVoid = 14; > public static final int tAddress = 15; > public static final int tNarrowOop = 16; > public static final int tMetadata = 17; > public static final int tNarrowKlass = 18; > public static final int tConflict = 19; > public static final int tIllegal = 99; > > Thank you, > Jini (Not a Reviewer). > > On 9/29/2017 2:58 PM, Yasumasa Suenaga wrote: >> Hi all, >> >> This change has been reviewed by Serguei. >> I'm waiting for another reviewer. 
>> >> >> Thanks, >> >> Yasumasa >> >> >> 2017/09/27 ??9:49 "Yasumasa Suenaga" > >: >> >> Hi David, Serguei, >> >> I added noreg-hard label and how to reproduce to JBS: >> >> https://bugs.openjdk.java.net/browse/JDK-8187401 >> >> >> >> Also I uploaded new webrev for jdk10/hs: >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8187401/webrev.01/ >> >> >> >> Thanks, >> >> Yasumasa >> >> >> >> 2017-09-27 8:25 GMT+09:00 serguei.spitsyn at oracle.com >> >> >: >> > On 9/26/17 16:22, David Holmes wrote: >> >> >> >> On 27/09/2017 8:52 AM, serguei.spitsyn at oracle.com >> wrote: >> >>> >> >>> Hi David, >> >>> >> >>> >> >>> On 9/26/17 15:09, David Holmes wrote: >> >>>> >> >>>> Hi Sergeui, >> >>>> >> >>>> On 27/09/2017 3:51 AM, serguei.spitsyn at oracle.com >> wrote: >> >>>>> >> >>>>> Hi Yasumasa, >> >>>>> >> >>>>> >> >>>>> On 9/26/17 02:41, Yasumasa Suenaga wrote: >> >>>>>> >> >>>>>> Hi Serguei, >> >>>>>> >> >>>>>> Thank you for your comment! >> >>>>>> >> >>>>>>> This fix looks Ok to me but you need to add a unit test. >> >>>>>> >> >>>>>>? ?I guess it is caused by inlined method which is generated >> by JIT >> >>>>>> compiler. I don't know how to reproduce it on jtreg test. >> >>>>>> Do you have any idea for it? >> >>>>> >> >>>>> >> >>>>> I'm not sure what exact problem you have with jtreg. >> >>>>> You may want to try to use other jtreg tests as examples. >> >>>> >> >>>> >> >>>> I see two problems: >> >>>> >> >>>> 1. hsdb is an interactive GUI tool >> >>> >> >>> >> >>> There is already at least one jtreg hsdb test: >> >>> open/test/hotspot/jtreg/serviceability/sa/JhsdbThreadInfoTest.java >> >>> >> >>> Not sure, if this example would help in this case though. >> >>> >> >>>> 2. The problem seems related to JIT inlining - so how do you >> force that >> >>>> in a test? >> >>> >> >>> >> >>> Then I wonder how was it forced in the manual reproducer? >> >>> The fact it is fixed has to be verified anyway. 
>> >> >> >> >> >> Well the reproducer happens to hit the issue, so we can use it >> to manually >> >> verify. >> >> >> >>>> I would think this is a noreg-hard situation. As long as there >> is a >> >>>> manual reproducer that can be used to verify the fix - as per >> the bug report >> >>>> - that should be okay IMHO. >> >>> >> >>> >> >>> I'm Ok with adding noreg-hard label if it is hard to develop. >> >> >> >> >> >> Sounds good to me. The manual verification steps should be very >> clearly >> >> spelt out in the bug report so that even someone unfamiliar with >> hsdb (like >> >> me!) can follow them easily. >> > >> > >> > Sounds good, thanks. >> > >> > Serguei >> > >> >> >> >> Cheers, >> >> David >> >> >> >>> Thanks, >> >>> Serguei >> >>> >> >>>> Cheers, >> >>>> David >> >>>> >> >>>>> Thanks, >> >>>>> Serguei >> >>>>> >> >>>>>> Yasumasa >> >>>>>> >> >>>>>> >> >>>>>> 2017-09-26 18:15 GMT+09:00 serguei.spitsyn at oracle.com >> >> >>>>>> > >: >> >>>>>>> >> >>>>>>> Hi Yasumasa, >> >>>>>>> >> >>>>>>> This fix looks Ok to me but you need to add a unit test. >> >>>>>>> >> >>>>>>> Thanks, >> >>>>>>> Serguei >> >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> On 9/20/17 15:47, Yasumasa Suenaga wrote: >> >>>>>>>> >> >>>>>>>> PING: >> >>>>>>>> >> >>>>>>>> Have you checked this issue? >> >>>>>>>> >> >>>>>>>>> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8187401/webrev.00/ >> >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> Yasumasa >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> On 2017/09/11 11:16, Yasumasa Suenaga wrote: >> >>>>>>>>> >> >>>>>>>>> Hi all, >> >>>>>>>>> >> >>>>>>>>> This review request is a part of [1]. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> JBS: >> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8187401 >> >> >>>>>>>>> >> >>>>>>>>> webrev: >> >>>>>>>>> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8187401/webrev.00/ >> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I cannot access JPRT. So I need a sponsor. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Thanks, >> >>>>>>>>> >> >>>>>>>>> Yasumasa >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> [1] >> >>>>>>>>> >> >>>>>>>>> >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-September/021821.html >> >> >>>>>>>>> >> >>>>> >> >>> >> > >> From markus.gronlund at oracle.com Fri Oct 6 13:41:52 2017 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Fri, 6 Oct 2017 06:41:52 -0700 (PDT) Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations In-Reply-To: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> References: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> Message-ID: <7a1f4e1d-c2a2-4b0f-bdb6-e33f6e5248cb@default> Hi Robin, This looks good, thank you very much for adding this! Cheers Markus From: Robin Westberg Sent: den 6 oktober 2017 12:22 To: serviceability-dev at openjdk.java.net Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations Hi all, Please review this change to add event-based tracing events for biased lock revocations: Issue: https://bugs.openjdk.java.net/browse/JDK-8187042 Webrev (courtesy of Erik Gahlin): http://cr.openjdk.java.net/~egahlin/8187042/ Best regards, Robin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jini.george at oracle.com Fri Oct 6 15:09:48 2017 From: jini.george at oracle.com (Jini George) Date: Fri, 6 Oct 2017 20:39:48 +0530 Subject: PING: RFR: 8187401: Java Stack cannot be shown on HSDB In-Reply-To: <90bf8490-5f97-c5fb-cc22-4787452fe28e@gmail.com> References: <3d25fa66-085d-11a7-74f4-eba9fa0360b5@oracle.com> <71273032-0ad5-8aed-350a-b53d0da625b3@oracle.com> <05acbfe2-5d8a-b492-8f30-6180568ad26d@oracle.com> <4457260a-a6e3-57a4-ef40-5b83db09f3b4@oracle.com> <9d4c6c05-d927-1011-bf91-d35629d20800@oracle.com> <90bf8490-5f97-c5fb-cc22-4787452fe28e@gmail.com> Message-ID: <5ca7c727-4c40-d056-df5d-13ecad3898c6@oracle.com> >> One point I want to make is that we have the >> enum BasicTypeSize redefined in SA as public static final values, > > I will update BasicTypeSize in new webrev. Sorry Yasumasa, I meant BasicType only. Thanks, Jini. From yasuenag at gmail.com Fri Oct 6 15:22:11 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Sat, 7 Oct 2017 00:22:11 +0900 Subject: PING: RFR: 8187401: Java Stack cannot be shown on HSDB In-Reply-To: References: <3d25fa66-085d-11a7-74f4-eba9fa0360b5@oracle.com> <71273032-0ad5-8aed-350a-b53d0da625b3@oracle.com> <05acbfe2-5d8a-b492-8f30-6180568ad26d@oracle.com> <4457260a-a6e3-57a4-ef40-5b83db09f3b4@oracle.com> <9d4c6c05-d927-1011-bf91-d35629d20800@oracle.com> <90bf8490-5f97-c5fb-cc22-4787452fe28e@gmail.com> <5ca7c727-4c40-d056-df5d-13ecad3898c6@oracle.com> Message-ID: 2017/10/07 0:09 "Jini George" : One point I want to make is that we have the >> enum BasicTypeSize redefined in SA as public static final values, >> > > I will update BasicTypeSize in new webrev. > Sorry Yasumasa, I meant BasicType only. Ok, I continue to wait for second reviewer. Thanks, Yasumasa Thanks, Jini. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jini.george at oracle.com Fri Oct 6 15:28:25 2017 From: jini.george at oracle.com (Jini George) Date: Fri, 6 Oct 2017 20:58:25 +0530 Subject: RFR: 8187403: [Unknown generation] is shown in Stack Memory on HSDB In-Reply-To: References: <2ee4d0c9-bdd5-19f3-e036-0aa096b59946@gmail.com> <1869b689-d6ba-bfcd-6bc2-a4a5e0ca6695@oracle.com> <481be3d6-d19d-8fd0-6446-c8aea02488a6@oracle.com> Message-ID: <007f0dfb-1cf0-d30b-4b6b-bbc76278f0a8@oracle.com> Your changes look good, Yasumasa. Thanks, Jini (Not a Reviewer). On 9/29/2017 2:49 PM, Yasumasa Suenaga wrote: > Thanks Serguei, > > I'm waiting another reviewer. > > > Yasumasa > > > 2017/09/29 ??6:00 "serguei.spitsyn at oracle.com > " >: > > > > On 9/29/17 01:25, Yasumasa Suenaga wrote: > > Thanks Serguei, > > I've uploaded new webrev. Could you review again? > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.02/ > > > > Looks good. > I can sponsor it, but you probably need another review. > > Thanks, > Serguei > > > Yasumasa > > > > 2017-09-29 16:49 GMT+09:00 serguei.spitsyn at oracle.com > > >: > > Hi Yasumasa, > > > On 9/28/17 23:21, Yasumasa Suenaga wrote: > > Hi Serguet, > > Thank you for your comment. > > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/HeapRegion.java.frames.html > > > ? ?It seems, there is no reason for renaming 'type' to 't' > in the > initialize() method. > > I added new private member "type" as HeapRegionType. > > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/HeapRegion.java.udiff.html > > > So I renamed to "t" to avoid conflict. > > > There is no conflict. > Only local is used in the initialize() method. > Also, the initialize() method is static so that the instance > field 'type' is > not in its scope. > Otherwise, you could use this.type to avoid such a conflict. 
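The scoping point made in the comment above can be illustrated with a small sketch. The class and member names here are hypothetical and are not taken from HeapRegion.java; the sketch only shows why a local named 'type' in a static method does not clash with an instance field of the same name.

```java
// Hypothetical class, only to illustrate the scoping rule discussed above.
final class RegionSketch {
    // Instance field, analogous to the new private 'type' member.
    private final String type;

    RegionSketch(String type) {
        // The parameter shadows the field here; 'this.type' names the field.
        this.type = type;
    }

    // A static method has no 'this', so the instance field cannot be
    // referenced from it; a local named 'type' simply shadows the field
    // and the method compiles cleanly.
    static String initialize(String rawName) {
        String type = rawName.trim();
        return type;
    }

    String type() {
        return type;
    }

    public static void main(String[] args) {
        RegionSketch region = new RegionSketch(initialize("  Old "));
        System.out.println(region.type());
    }
}
```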
> > > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/HeapRegionManager.java.frames.html > > > ? ?89? ? ?public HeapRegion addrToRegion(Address addr) { > ? ?90? ? ? ?return regions().getByAddress(addr); > ? ?91? ? ?} > > ? ?A suggestion: replace 'addrToRegion' with 'getByAddress'. > ? ?It will look similar to the 'heapRegionIterator.' > > I've implemented it to follow HotSpot implementation. > > http://hg.openjdk.java.net/jdk10/hs/file/3a45532a1854/src/hotspot/share/gc/g1/heapRegionManager.inline.hpp#l32 > > > I think current proposal is easy to understand if other > people check > this with HotSpot. > Should I rename to "getByAddress" ? > > > ? ?81? ? ?public Iterator heapRegionIterator() { > ? ?82? ? ? ? ?return regions().heapRegionIterator(length()); > ? ?83? ? ?} > ? . . . > ? ?89? ? ?public HeapRegion addrToRegion(Address addr) { > ? ?90? ? ? ?return regions().getByAddress(addr); > ? ?91? ? ?} > > > ? ?There is already regions().getByAddress(addr), so I'm > suggesting to follow > this local pattern. > ? ?Renaming 'addrToRegion' to 'getByAddress' will unify it > with the > 'heapRegionIterator()'. > ? ?Otherwise, you would need to rename 'getByAddress' to > 'addrToRegion' > everywhere. > ? ?But I guess, it is better to avoid. > > Thanks, > Serguei > > > > Other your comment will be fixed in new webrev later. > > Thanks, > > Yasumasa > > > 2017-09-29 14:35 GMT+09:00 serguei.spitsyn at oracle.com > > >: > > Hi Yasumasa, > > Just some minor comments. > > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/G1HeapRegionTable.java.frames.html > > > I'd suggest to make the lines 144-145 a one-liner. > It won't be that big. Otherwise, the indent is not right. > > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/HeapRegion.java.frames.html > > > ? 
?The same as above for lines 85-86. > ? ?It seems, there is no reason for renaming 'type' to 't' > in the > initialize() method. > > > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/HeapRegionManager.java.frames.html > > > ? ?89? ? ?public HeapRegion addrToRegion(Address addr) { > ? ?90? ? ? ?return regions().getByAddress(addr); > ? ?91? ? ?} > > ? ?A suggestion: replace 'addrToRegion' with 'getByAddress'. > ? ?It will look similar to the 'heapRegionIterator.' > > > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/HeapRegionType.java.html > > > ? ?41? ? ?private static int freeTag; > ? ?42 > ? ?43? ? ?private static int youngMask; > ? ?44 > ? ?45? ? ?private static int humongousMask; > ? ?46 > ? ?47? ? ?private static int pinnedMask; > ? ?48 > ? ?49? ? ?private static int oldMask; > ? ?50 > ? ?51? ? ?private static CIntegerField tagField; > > ? ?Unneeded empty lines. > > ? ?Also, it looks like the fields 'freeTag' and > 'pinnedMask' are never > initialized. > ? ?Not sure, if it is intentional. > > Otherwise, the fix looks good to me. > > Thanks, > Serguei > > > On 9/28/17 18:20, Yasumasa Suenaga wrote: > > Hi Serguei, > > I added it to JBS: > > https://bugs.openjdk.java.net/browse/JDK-8187403?focusedCommentId=14119248&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14119248 > > > Sorry for my English. I'm not good at English... > > > Please, don't worry about your English. > Your description looks good. > Thank you for the bug report update! > > Thanks, > Serguei > > > Yasumasa > > > > 2017-09-29 8:27 GMT+09:00 serguei.spitsyn at oracle.com > > >: > > Hi Yasumasa, > > Could you, please, also add some evaluation to the bug > report about what is > the root cause and how do you fix it? 
> > Thanks, > Serguei > > > > On 9/26/17 18:10, Yasumasa Suenaga wrote: > > Hi all, > > I added noreg-hard label to JBS because this issue appears Stack > Memory window on HSDB (GUI application). So it is hard to test. > > https://bugs.openjdk.java.net/browse/JDK-8187403 > > > > Thanks, > > Yasumasa > > > > 2017-09-26 23:55 GMT+09:00 Yasumasa Suenaga > >: > > Hi all, > > I uploaded new webrev to be adapted to jdk10/hs: > > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.01/ > > > > Thanks, > > Yasumasa > > > On 2017/09/21 7:48, Yasumasa Suenaga wrote: > > PING: > > Have you checked this issue? > > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.00/ > > > > Yasumasa > > > > On 2017/09/11 11:18, Yasumasa Suenaga wrote: > > Hi all, > > This review request is a part of [1]. > > > JBS: > https://bugs.openjdk.java.net/browse/JDK-8187403 > > > webrev: > http://cr.openjdk.java.net/~ysuenaga/JDK-8187403/webrev.00/ > > > > I cannot access JPRT. So I need a sponsor. > > > Thanks, > > Yasumasa > > > [1] > > http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-September/021821.html > > > > > From erik.gahlin at oracle.com Sat Oct 7 18:31:03 2017 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Sat, 7 Oct 2017 20:31:03 +0200 Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations In-Reply-To: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> References: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> Message-ID: Looks good! Erik > On 6 Oct 2017, at 12:22, Robin Westberg wrote: > > Hi all, > > Please review this change to add event-based tracing events for biased lock revocations: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8187042 > Webrev (courtesy of Erik Gahlin): http://cr.openjdk.java.net/~egahlin/8187042/ > > Best regards, > Robin -------------- next part -------------- An HTML attachment was scrubbed... 
From david.holmes at oracle.com Mon Oct 9 01:26:54 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Oct 2017 11:26:54 +1000 Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations In-Reply-To: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> References: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> Message-ID: Hi Robin, On 6/10/2017 8:22 PM, Robin Westberg wrote: > Hi all, > > Please review this change to add event-based tracing events for biased > lock revocations: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8187042 > Webrev (courtesy of Erik Gahlin): > http://cr.openjdk.java.net/~egahlin/8187042/ I have a few queries: First, why is there no event for the self-revocation path? Second, is there a reason you can't put the event management inside the VM operation code and so avoid the need to adjust the safepoint counter? Third, I would have expected to see more detail in the event such as which thread (id) the object was biased to and which thread revoked the bias. Even perhaps some notion of which instance was involved (though that's harder to show). Thanks, David > Best regards, > Robin From harsha.wardhana.b at oracle.com Mon Oct 9 05:34:52 2017 From: harsha.wardhana.b at oracle.com (Harsha Wardhana B) Date: Mon, 9 Oct 2017 11:04:52 +0530 Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent In-Reply-To: <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> Message-ID: <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com> Hi Daniel, Below is the webrev addressing the review comments.
http://cr.openjdk.java.net/~hb/5016517/webrev.04/ On Friday 06 October 2017 03:38 PM, Daniel Fuchs wrote: > Hi Harsha, > > Good work! > > > http://cr.openjdk.java.net/~hb/5016517/webrev.03/ > > long standing typo in management.properties at line 90: > > measureRole => monitorRole Done. > > HashedPasswordManager.java: loadPasswords() > > It seems this function will add the header to the file even > if it already contains the header. > > So every time a user/administrator wants to change/add a password, > the header will be inserted again. > Yes. It is fixed in the new webrev. > > HashedPasswordFileTest should probably have a test for this > scenario as well: > > generate password file with clear text password > load it, then verify passwords have been hashed (properly) > add some new user/name password to the same file > load it again, verify all passwords are hashed > (do this a number of times - to make sure it doesn't > break the second or third time) > and finally verify the header is only present once ;-) > Done. Added a testcase for the same. > I'm surprised no other tests had to be modified. > Is password hash disabled by default in the default agent? > Password hashing is enabled by default. But it is only the implementation that is changed. The pluggable JAAS mechanism isolates interfaces from implementation. So in theory, all tests should pass. > If not then you should try (locally) running jtreg > more than once over the default agent tests. > Just make sure running the same test twice doesn't > make the legacy tests that use password files failing the > second time when they discover that passwords have been > hashed under their feet (the client part of the test > might be reading the password file too to see which > password it should send to the agent). > > Otherwise I think it looks good to me - provided all > tests are passing! > Done. Had a few test failures but nothing related to this enhancement.
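The repeated-load scenario discussed above (hash clear passwords, keep already hashed ones, insert the header only once) can be sketched as a pure function over the file's lines. This is an illustration under stated assumptions, not the HashedPasswordManager code from the webrev: the header text, the `{SHA-256}` marker and the `user password` line layout are invented here and are not the format described in jmxremote.password.template, and the hash is unsalted for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

// Hypothetical sketch of an idempotent password-file rewrite.
final class PasswordFileRewriteSketch {
    // Both strings are invented for this sketch.
    static final String HEADER = "# Passwords in this file are hashed.";
    static final String MARK = "{SHA-256}";

    // Returns the rewritten lines; calling it again on its own output
    // must change nothing.
    static List<String> rewrite(List<String> lines) {
        List<String> out = new ArrayList<>();
        if (!lines.contains(HEADER)) {
            out.add(HEADER); // insert the header only when it is missing
        }
        for (String line : lines) {
            String trimmed = line.trim();
            // Only spaces and tabs separate fields, as stated in the review.
            String[] fields = trimmed.split("[ \\t]+");
            boolean isEntry = !trimmed.startsWith("#") && fields.length == 2;
            if (isEntry && !fields[1].startsWith(MARK)) {
                out.add(fields[0] + " " + MARK + sha256(fields[1]));
            } else {
                out.add(line); // comments, blanks and hashed entries pass through
            }
        }
        return out;
    }

    private static String sha256(String clear) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(clear.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> once = rewrite(Arrays.asList("monitorRole secret"));
        List<String> twice = rewrite(once);
        System.out.println(once.equals(twice)); // second pass leaves the file unchanged
    }
}
```

Keeping the rewrite free of file I/O makes the "load it a second and third time" check a plain equality comparison, which is the property the proposed test is after.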
> best regards, > > -- daniel Thanks Harsha > > On 06/10/2017 06:25, Harsha Wardhana B wrote: >> Hi All, >> >> Previously, for default agent, hashing of the passwords was done >> during the agent boot-up (ConnectorBootstrap.java). That was an error >> since login configuration could be different and is determined only >> when a login attempt is made. It would be then pointless to hash the >> password file. The fix for above and some off-list comments are >> incorporated in webrev below. >> >> http://cr.openjdk.java.net/~hb/5016517/webrev.03/ >> >> -Harsha >> >> >> On Wednesday 04 October 2017 01:53 PM, Harsha Wardhana B wrote: >>> Hi Roger, >>> >>> Below is the webrev incorporating changes suggested by you. >>> >>> http://cr.openjdk.java.net/~hb/5016517/webrev.02/ >>> >>> -Harsha >>> >>> On Wednesday 04 October 2017 12:54 AM, Roger Riggs wrote: >>>> Hi Harsha, >>>> >>>> FileLoginModule.java:? 104:? Add a period at the end of the the >>>> sentence. >>>> >>>> JMXPluggableAuthenticator.java: line 306:? Is the difference >>>> between singular and plural significant? >>>> ? It would be less confusing if both were plural (hashPasswords). >>> Ok. >>>> ConnectorBootstrap: >>>> 134: ...password.file.hash" and HashedPasswordManager disagree on >>>> the exact string. >>>> I would propose 'hashpasswords' as the suffix in all places to be >>>> consistent >>>> in ConnectorBootstrap.java, HashedPasswordManager (except for >>>> capitalization), >>>> jmxremote.password.template, and management.properties >>> Do you want to rename HashedPasswordManager class? >>>> >>>> As is you have a mix of "...password.hash", >>>> "...password.file.hash", "...hashpassword"; >>>> that's not good for knowing there is only one semantic. >>>> >>>> line 482:? " ," -> ", "? space after comma, not before >>>> >>> Will incorporate above comments. >>>> line: 771: is it intentional to discard the reference to the new >>>> HashedPasswordManager? 
>>>> If the intention is only to use the side effect of loadPasswords, >>>> then please >>>> create a static method in HashedPasswordManager for that purpose. >>>> (Even if just does the same code; it would be clear that's the >>>> purpose). >>>> (It probably also implies that the password file will be read a >>>> second time somewhere else in the initialization). >>> Static methods just to hash passwords can be created but >>> HashedPasswordManager class will have to be re-factored since almost >>> all methods are using instance variables. Not sure if we want >>> instance methods and look-alike static methods side-by-side. >>> Wouldn't that be more confusing than current implementation? >>>> >>>> line:770:? the string constant would be nicer as a final static >>>> string somewhere. >>>> ? "jmx.remote.x.password.file.hashpassword" >>> All of "jmx.remote.x.*" don't have static strings. They are used 'as >>> is' all over the code to maintain isolation between pluggable login >>> authenticator and JDK code. >>>> >>>> Roger >>>> >>>> >>> Harsha >>>> >>>> On 10/3/2017 3:47 PM, Harsha Wardhana B wrote: >>>>> >>>>> Hi Roger,>>> Thanks for the detailed review. Below is the webrev >>>>> addressing all the review comments. >>>>> >>>>> http://cr.openjdk.java.net/~hb/5016517/webrev.01/ >>>>> >>>>> -Harsha >>>>> >>>>> >>>>> On Tuesday 25 April 2017 10:56 PM, Roger Riggs wrote: >>>>>> Hi Harsha, >>>>>> >>>>>> Thanks for this important improvement. Comments: >>>>>> >>>>>> >>>>>> * jmxremote.password.template: >>>>>> ? "Passwords will be hashed by server if they are in clear." >>>>>> Perhaps should be more explicit: >>>>>> >>>>>> ?? "The jmxremote.passwords file will be re-written by the server >>>>>> to replace all plain text passwords with hashed passwords when >>>>>> the file is read by the server." >>>>>> >>>>>> line 35: "Base64 encoded hash"? -> drop the "Base64" in this line >>>>>> isn't needed and >>>>>> make it seems like it should appear as 1 field instead of 2 or 3. 
>>>>>> >>>>>> 37+: The syntax of the file may be clearer if it includes the >>>>>> complete syntax in (line 39) not >>>>>> just the password/hash fragment. >>>>>> >>>>>> Line 41:? "W = spaces"; above "tabs" are allowed as a delimiter; >>>>>> it would be good to be consistent >>>>>> and include the usualy white-space characters in the set, be as >>>>>> specific as possible. >>>>>> Is this the same set of whitespace used by Regex '\\s'. >>>>> Only spaces and tabs are allowed. '\s' matches newline as well >>>>> hence not allowed. >>>>>> >>>>>> 45: "java platform""? ->?? "MD5, SDA-1, SHA-256 are supported >>>>>> algorithms." >>>>>> >>>>>> 49: be more specific about 'hashing is requested'? how? Refer to >>>>>> the management.properties >>>>>> ? com.sun.management.jmxremote.password.hash value. >>>>>> >>>>>> >>>>>> >>>>>> 51:? "replace hashed" -> "replace *the *hashed" >>>>>> 52: "with clear text or new" -> "with the clear text or the new" >>>>>> 52: "If new password" -> "If the new password" >>>>>> 53: "when new login" -> "when a new login" >>>>>> >>>>>> 60: "User generated" -> "A User generated" >>>>>> >>>>>> 67: Will the file be ignored if it has the wrong permissions. >>>>>> (With a logged message) >>>>> Addressed all the above review comments. >>>>>> >>>>>> * management.properties >>>>>> >>>>>> 306: "(Case for true/false ignored)"? - what does this mean; I >>>>>> think it can be removed. >>>>>> >>>>>> 307: missing period at the end of the sentence. >>>>>> 309: "in password file" -> "in the password file" >>>>>> >>>>> Done. >>>>>> >>>>>> * FileLoginModule.java >>>>>> >>>>>> 102: can this match better the similar name in the >>>>>> management.properties if it has the same function: >>>>>> ??? com.sun.management.jmxremote.password.hash >>>>> Are you suggesting that 'hashPassword' be renamed to something >>>>> similar to com.sun.management.jmxremote.password.hash? 
Variable >>>>> names cannot be similar to property names since property names are >>>>> long and provide complete context which local variables need not >>>>> have to do. >>>> the suffix should be the same in all places since it is a single >>>> semantic. >>> Done. >>>>>> 103: "replaces clear text passwords" -> "replaces each clear text >>>>>> password" >>>>>> 104: indent to match previous
entries. >>>>>> >>>>>> * JMXPluggableAuthenticator.java >>>>>> >>>>>> 119: There is no need to copy the password to a new local >>>>> It is required since variables accessed from inner class must be >>>>> final or effectively final. >>>> right >>>>>> >>>>>> 128: add a space after "," >>>>>> >>>>>> 256 private static final String HASH_PASSWORDS = >>>>>> 257 "jmx.remote.x.password.file.hash"; >>>>>> >>>>>> The name ".hash" part does not clearly communicate that passwords >>>>>> are to be hashed. >>>>>> "hashPasswords" might be more self explanatory. >>>>> Changed it to "jmx.remote.x.password.file.hashpassword". >>>> drop the "file." >>> Done. >>>>>> Also, can this be NOT duplicated here and in >>>>>> ConnectorBootStrap.java? >>>>> The property names used in ConnectorBootStrap follow the >>>>> convention used in management.properties file - >>>>> 'com.sun.management.*'. For environment variables for a >>>>> JMXConnector "jmx.remote.x.*" convention is used. Hence they >>>>> cannot be duplicated. >>>> The differing prefixes are fine as is; no change except to make >>>> the new keys consistent. >>>> >>>>>> >>>>>> >>>>>> * ConnectorBootStrap.java: >>>>>> 482: Add space after ","s; no spaces before. >>>>>> >>>>>> 770: use the same name for the option/property if possible to >>>>>> avoid confusion. >>>>> Not possible as explained above. >>>>>> >>>>>> 770: if the HASH_PASSWORDS static is appropriate use it instead >>>>>> of literal "true". >>>>> DefaultValues.HASH_PASSWORDS static is set to 'true' and can be >>>>> used. However using literal "true" is more readable than using the >>>>> static. >>>>>> >>>>>> * HashedPasswordManager >>>>>> >>>>>> 80-83: The fields can be final and use the constructor to >>>>>> initialize in all cases and make the class final >>>>>> to avoid unintentional subclassing. >>>>>> >>>>>> >>>>>> 113: canWriteToFile: It should be made clear in the template >>>>>> that *both* the Security policy >>>>>>
and the file access value are used to check that the file can >>>>>> be updated. >>>>> Made it explicit in template as well as code comments. >>>>>> >>>>>> 200: loadPasswords() - should this confirm the access to the file >>>>>> is allowed and it has >>>>>> the correct file access before reading? >>>>> Not really required. Appropriate exceptions are thrown if file >>>>> cannot be accessed. >>>>>> >>>>>> Is the re-writing of the passwords intended to be done by a >>>>>> 'priveleged' system. >>>>>> Does this need doPrivileged? >>>>> I am not sure. Maybe it will be covered in the security review. >>>>>> >>>>>> * HashedPasswordFileTest: >>>>>> >>>>>> 88: should use the TestLibrary Utils.getRandomInstance so it logs >>>>>> the seed and can be replayed if necessary. >>>>>> >>>>>> >>>>> Done >>>>>> Thanks, Roger >>>>>> >>>>> Thanks >>>>> Harsha >>>>>> >>>>>> On 4/23/2017 6:20 AM, Harsha Wardhana B wrote: >>>>>>> >>>>>>> Hi All, >>>>>>> >>>>>>> Please review this enhancement to replace plain-text password >>>>>>> for JMX agent with SHA-256 hash. >>>>>>> >>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-5016517 >>>>>>> >>>>>>> >>>>>>> webrev: http://cr.openjdk.java.net/~hb/5016517/webrev.00/ >>>>>>> >>>>>>> Overview of implementation: >>>>>>> >>>>>>> Currently, the JMX agent password file used to authenticate >>>>>>> user, stores user name and password as clear text. Though system >>>>>>> level restrictions are recommended for jmx password file, >>>>>>> passwords are vulnerable since they are stored in clear. The >>>>>>> current RFE proposes to store passwords as SHA256 hash instead >>>>>>> of clear text. >>>>>>> >>>>>>> In current implementation, if password file is writable, and if >>>>>>> passwords are in clear, they will be replaced by SHA256 hash >>>>>>> upon agent boot-up or when login attempt is made. >>>>>>> >>>>>>> The file, >>>>>>> src/jdk.management.agent/share/conf/jmxremote.password.template >>>>>>> contains more details about the implementation. 
>>>>>>> >>>>>>> - Harsha >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From yasuenag at gmail.com Mon Oct 9 14:19:50 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Mon, 9 Oct 2017 23:19:50 +0900 Subject: [10] RFR: 8185796: jstack and clhsdb jstack should show lock objects In-Reply-To: <66c16569-d7bd-3f4c-6dff-b5b27a77d38f@gmail.com> References: <1734e8a3-5178-2953-2b42-b443177e9cc2@gmail.com> <33bf35e5-7d8b-db21-4209-edd675c42d85@oracle.com> <66c16569-d7bd-3f4c-6dff-b5b27a77d38f@gmail.com> Message-ID: <56759c30-f67c-3f1d-e056-70a090c916b0@gmail.com> Hi all, I uploaded new webrev to be adapted to current jdk10/hs: http://cr.openjdk.java.net/~ysuenaga/JDK-8185796/webrev.03/ Please review and sponsor it. Thanks, Yasumasa On 2017/09/27 0:31, Yasumasa Suenaga wrote: > Hi all, > > I uploaded new webrev to be adapted to jdk10/hs: > > http://cr.openjdk.java.net/~ysuenaga/JDK-8185796/webrev.02/ > > > Thanks, > > Yasumasa > > > On 2017/08/24 22:59, Yasumasa Suenaga wrote: >> Thanks Jini! >> >> I uploaded new webrev: >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8185796/webrev.01/ >> >> This webrev has been ported print_lock_info() to JavaVFrame.java, and I've added new testcase for `jhsdb jstack` and jstack command on `jhsdb clhsdb`. >> >> >> Yasumasa >> >> >> On 2017/08/24 18:01, Jini George wrote: >>> Apologize for the late reply, Yasumasa. >>> >>> >>>> I think so, but I guess it is difficult. >>>> For example, test for CLHSDB command is provided as test/serviceability/sa/TestPrintMdo.java . >>>> But target process seems to be fixed to "LingeredApp". >>>> Can we change it to another program which generates lock contention? >>> >>> You can take a look at any of the hotspot/test/serviceability/sa/LingeredAppWith*.java files for this.
The target process does not have to be fixed to LingeredApp -- in these LingeredAppWith* cases, the targets are test-specific variations built on top of LingeredApp for ease of implementation. >>> >>> Thanks, >>> Jini. From robin.westberg at oracle.com Mon Oct 9 14:57:58 2017 From: robin.westberg at oracle.com (Robin Westberg) Date: Mon, 9 Oct 2017 16:57:58 +0200 Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations In-Reply-To: References: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> Message-ID: <3059930F-FB64-43A2-A52E-AED01B95796F@oracle.com> Hi David, Thanks for taking the time to look at this! > On 9 Oct 2017, at 03:26, David Holmes wrote: > > Hi Robin, > > On 6/10/2017 8:22 PM, Robin Westberg wrote: >> Hi all, >> Please review this change to add event-based tracing events for biased lock revocations: >> Issue: https://bugs.openjdk.java.net/browse/JDK-8187042 >> Webrev (courtesy of Erik Gahlin): http://cr.openjdk.java.net/~egahlin/8187042/ > > I have a few queries: > > First, why is there no event for the self-revocation path? That's a good question, I did not want to add events for revocations done without consulting the bulk revocation heuristics (as that could generate a very large number of events), so I may have mistakenly thought of the self-revocation as a part of that. But it certainly makes sense to have an event there as well, I'll do a bit of testing and add one. > Second, is there a reason you can't put the event management inside the VM operation code and so avoid the need to adjust the safepoint counter? Well yes, the event itself is configured to record the current thread together with a stack trace, but that requires that the event is actually generated from the thread that should be recorded. > Third, I would have expected to see more detail in the event such as which thread (id) the object was biased to and which thread revoked the bias.
Even perhaps some notion of which instance was involved (though that's harder to show). Right, I've been looking at capturing which thread the object was biased towards, but I was afraid of the possible races there as the thread pointer in the mark would have to be saved before executing the VM operation. For that to work 100% reliably I suspect it would have to be done inside the safepoint. I will create an updated webrev after looking into adding an event for the self-revocation path. Best regards, Robin From jini.george at oracle.com Mon Oct 9 16:15:55 2017 From: jini.george at oracle.com (Jini George) Date: Mon, 9 Oct 2017 21:45:55 +0530 Subject: RFR: SA: MacOS X: 8184042: several serviceability/sa tests timed out on MacOS X In-Reply-To: <081d276d-4616-364c-2f9c-acda35b4eaf4@oracle.com> References: <2121618c-2878-385e-72bb-d9dc6c25b65a@oracle.com> <7b5ba4a7-2463-d9e3-b0a4-76957c630719@oracle.com> <081d276d-4616-364c-2f9c-acda35b4eaf4@oracle.com> Message-ID: <6b4cb988-a4e4-de22-1728-c87dccfde972@oracle.com> Hi all, I have created a webrev restoring the PT_ATTACH: http://cr.openjdk.java.net/~jgeorge/8184042/webrev.01/ Have included Dmitry's comments on disabling the deprecation warning. I would like to request reviews for this. Thank you, Jini. On 9/8/2017 3:09 AM, serguei.spitsyn at oracle.com wrote: > On 8/25/17 02:24, serguei.spitsyn at oracle.com wrote: >> Hi Jini, >> >> >> On 8/18/17 04:00, David Holmes wrote: >>> Hi Jini, >>> >>> Just reading the bug report and your description below this seems >>> like a major change to try and use a facility (mach exceptions) that >>> no one seems to have any experience with! That isn't something to be >>> rushed. >> >>> Even if PT_ATTACH has been deprecated restoring its use may be the >>> quick way forward instead of trying to rush in something like this. >> >> This approach looks reasonable to me. > > I've just realized that my statement might sound incorrect.
> I meant that the David's suggestion to restore the use of the deprecated > PT_ATTACH looks reasonable. > > Sorry, if it caused any confusion. > > Thanks, > Serguei > > >> Otherwise, it would be nice to hear why it is not good. >> How much would it break the fix of the JDK-8182299? >> >> Thanks, >> Serguei >> >>> >>> Just my 2c. >>> >>> Cheers, >>> David >>> >>> On 18/08/2017 8:00 PM, Jini George wrote: >>>> Hi all, >>>> >>>> Requesting reviews for: >>>> https://bugs.openjdk.java.net/browse/JDK-8184042 >>>> >>>> Webrev: http://cr.openjdk.java.net/~jgeorge/8184042/webrev.00/ >>>> >>>> Problem gist: The deprecated ptrace() command, PT_ATTACH was changed >>>> to PT_ATTACHEXC, which causes mach exceptions (and not UNIX signals) >>>> to be delivered via mach messages.This caused SA to hang at >>>> waitpid() waiting for a signal, which does not arrive. >>>> >>>> Solution in a nutshell: The solution is to make the required changes >>>> to handle mach 'soft signal' exceptions in the form of mach messages >>>> instead of signals, while attaching to and detaching from the target >>>> process. The detailed steps are outlined in JBS. >>>> >>>> The changes appear huge due to the inclusion of pre-generated mach >>>> exception handling files (mach_exc*). Since this is an integration >>>> blocker, it would be great to get quick reviews on this. >>>> >>>> Thank you, >>>> Jini. >>>> >>>> >>>> >>>> >>>> >>>> >> > From daniel.daugherty at oracle.com Mon Oct 9 19:41:24 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 9 Oct 2017 13:41:24 -0600 Subject: RFR(XL): 8167108 - SMR and JavaThread Lifecycle Message-ID: <1e50bb73-840c-fc3a-81ad-31f83037093f@oracle.com> Greetings, We have a (eXtra Large) fix for the following bug: 8167108 inconsistent handling of SR_lock can lead to crashes https://bugs.openjdk.java.net/browse/JDK-8167108 This fix adds a Safe Memory Reclamation (SMR) mechanism based on Hazard Pointers to manage JavaThread lifecycle. 
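As background for readers unfamiliar with the technique named above, the generic hazard-pointer protocol can be sketched as below: a reader publishes the pointer it is about to use and re-checks it, and a reclaimer only frees a retired object that no reader has published. This is the textbook pattern written in Java for illustration, not the HotSpot Thread-SMR code from the webrev; the class HazardPointerSketch, its methods, and the reclaimed flag (standing in for freeing memory) are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Textbook hazard-pointer protocol; hypothetical names, NOT HotSpot code.
class HazardPointerSketch {
    static final class Node {
        final String name;
        volatile boolean reclaimed; // stands in for "memory has been freed"
        Node(String name) { this.name = name; }
    }

    final AtomicReference<Node> shared = new AtomicReference<>();
    final AtomicReferenceArray<Node> hazards; // one slot per reader

    HazardPointerSketch(int readers, Node initial) {
        hazards = new AtomicReferenceArray<>(readers);
        shared.set(initial);
    }

    // Reader: publish the pointer, then re-check that it is still current.
    Node protect(int slot) {
        while (true) {
            Node n = shared.get();
            hazards.set(slot, n);
            if (shared.get() == n) {
                return n; // a scan that starts after the unlink will see this slot
            }
            // The node was replaced between the read and the publish; retry.
        }
    }

    // Reader: done with the node, clear the published pointer.
    void release(int slot) {
        hazards.set(slot, null);
    }

    // Writer: unlink the old node; it is now retired but not yet reclaimed.
    Node replace(Node next) {
        return shared.getAndSet(next);
    }

    // Reclaimer: reclaim a retired node only if no reader has published it.
    boolean tryReclaim(Node retired) {
        for (int i = 0; i < hazards.length(); i++) {
            if (hazards.get(i) == retired) {
                return false; // still in use, try again later
            }
        }
        retired.reclaimed = true;
        return true;
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        HazardPointerSketch s = new HazardPointerSketch(2, a);
        s.protect(0);                         // reader 0 is using A
        Node retired = s.replace(new Node("B"));
        System.out.println(s.tryReclaim(retired)); // false: reader 0 still holds A
        s.release(0);
        System.out.println(s.tryReclaim(retired)); // true: no reader holds A any more
    }
}
```

The re-check in protect() is the essential step: a reader that publishes after the node was unlinked notices the change and retries, so the reclaimer's scan never misses a reader that is still entitled to use the node.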
Here's a PDF for the internal wiki that we've been using to describe and track the work on this project: http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf Dan has noticed that the indenting is wrong in some of the code quotes in the PDF that are not present in the internal wiki. We don't have a solution for that problem yet. Here's the webrev for current JDK10 version of this fix: http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full This fix has been run through many rounds of JPRT and Mach5 tier[2-5] testing, additional stress testing on Dan's Solaris X64 server, and additional testing on Erik and Robbin's machines. We welcome comments, suggestions and feedback. Daniel Daugherty Erik Osterlund Robbin Ehn From daniel.daugherty at oracle.com Mon Oct 9 21:23:23 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 9 Oct 2017 15:23:23 -0600 Subject: RFR(XL): 8167108 - SMR and JavaThread Lifecycle In-Reply-To: <1e50bb73-840c-fc3a-81ad-31f83037093f@oracle.com> References: <1e50bb73-840c-fc3a-81ad-31f83037093f@oracle.com> Message-ID: <546f3f48-47cf-73d1-30b1-b388418ae0bf@oracle.com> Many thanks to the folks that reviewed this internally and provided much appreciated feedback: - Daniel Daugherty - David Holmes - Erik Osterlund - Jerry Thornbrugh - Karen Kinnear - Kim Barrett - Robbin Ehn - Serguei Spitsyn - Stefan Karlson Since there are three contributing authors, we have been reviewing (and arguing over) each other's code. It has been an adventure! Dan, Erik, and Robbin On 10/9/17 1:41 PM, Daniel D. Daugherty wrote: > Greetings, > > We have a (eXtra Large) fix for the following bug: > > 8167108 inconsistent handling of SR_lock can lead to crashes > https://bugs.openjdk.java.net/browse/JDK-8167108 > > This fix adds a Safe Memory Reclamation (SMR) mechanism based on > Hazard Pointers to manage JavaThread lifecycle. 
> > Here's a PDF for the internal wiki that we've been using to describe > and track the work on this project: > > http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf > > > Dan has noticed that the indenting is wrong in some of the code quotes > in the PDF that are not present in the internal wiki. We don't have a > solution for that problem yet. > > Here's the webrev for current JDK10 version of this fix: > > http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full > > This fix has been run through many rounds of JPRT and Mach5 tier[2-5] > testing, additional stress testing on Dan's Solaris X64 server, and > additional testing on Erik and Robbin's machines. > > We welcome comments, suggestions and feedback. > > Daniel Daugherty > Erik Osterlund > Robbin Ehn > From david.holmes at oracle.com Mon Oct 9 21:59:44 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Oct 2017 07:59:44 +1000 Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations In-Reply-To: <3059930F-FB64-43A2-A52E-AED01B95796F@oracle.com> References: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> <3059930F-FB64-43A2-A52E-AED01B95796F@oracle.com> Message-ID: Hi Robin, On 10/10/2017 12:57 AM, Robin Westberg wrote: > Hi David, > > Thanks for taking the time to look at this! > >> On 9 Oct 2017, at 03:26, David Holmes wrote: >> >> Hi Robin, >> >> On 6/10/2017 8:22 PM, Robin Westberg wrote: >>> Hi all, >>> Please review this change to add event-based tracing events for biased lock revocations: >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8187042 >>> Webrev (courtesy of Erik Gahlin): http://cr.openjdk.java.net/~egahlin/8187042/ >> >> I have a few queries: >> >> First, why is there no event for the self-revocation path? 
> > That's a good question, I did not want to add events for revocations done without consulting the bulk revocation heuristics (as that could generate a very large number of events), so I may have mistakenly thought of the self-revocation as a part of that. But it certainly makes sense to have an event there as well, I'll do a bit of testing and add one. Ok. >> Second, is there a reason you can't put the event management inside the VM operation code and so avoid the need to adjust the safepoint counter? > > Well yes, the event itself is configured to record the current thread together with a stack trace, but that requires that the event is actually generated from the thread that should be recorded. Ah - good point. And good to know the requesting thread is captured - that wasn't clear to me from the event snippets in the webrev. >> Third, I would have expected to see more detail in the event such as which thread (id) the object was biased to and which thread revoked the bias. Even perhaps some notion of which instance was involved (though that's harder to show). > > Right, I've been looking at capturing which thread the object was biased towards, but I was afraid of the possible races there as the thread pointer in the mark would have to be saved before executing the VM operation. For that to work 100% reliably I suspect it would have to be done inside the safepoint. Right, the thread holding the bias may not even exist any more! This may need to utilise the new Thread-SMR work (as a future RFE of course). :) > I will create an updated webrev after looking into adding an event for the self-revocation path.
Thanks, David > Best regards, > Robin > From jcbeyler at google.com Mon Oct 9 22:57:45 2017 From: jcbeyler at google.com (JC Beyler) Date: Mon, 9 Oct 2017 15:57:45 -0700 Subject: Low-Overhead Heap Profiling In-Reply-To: References: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> <365499b6-3f4d-a4df-9e7e-e72a739fb26b@oracle.com> <102c59b8-25b6-8c21-8eef-1de7d0bbf629@oracle.com> <1497366226.2829.109.camel@oracle.com> <1498215147.2741.34.camel@oracle.com> <044f8c75-72f3-79fd-af47-7ee875c071fd@oracle.com> <23f4e6f5-c94e-01f7-ef1d-5e328d4823c8@oracle.com> Message-ID: Dear all, Thread-safety is back!! Here is the update webrev: http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/ Full webrev is here: http://cr.openjdk.java.net/~rasbold/8171119/webrev.11/ In order to really test this, I needed to add this so thought now was a good time. It required a few changes here for the creation to ensure correctness and safety. Now we keep the static pointer but clear the data internally so on re-initialize, it will be a bit more costly than before. I don't think this is a huge use-case so I did not think it was a problem. I used the internal MutexLocker, I think I used it well, let me know. I also added three tests: 1) Stack depth test: http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStackDepthTest.java.patch This test shows that the maximum stack depth system is working. 2) Thread safety: http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadTest.java.patch The test creates 24 threads and they all allocate at the same time. The test then checks it does find samples from all the threads. 
3) Thread on/off safety http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadOnOffTest.java.patch The test creates 24 threads that all allocate a bunch of memory. Then another thread turns the sampling on/off. Btw, both tests 2 & 3 failed without the locks. As I worked on this, I saw a lot of places where the tests are doing very similar things, I'm going to clean up the code a bit and make a HeapAllocator class that all tests can call directly. This will greatly simplify the code. Thanks for any comments/criticisms! Jc On Mon, Oct 2, 2017 at 8:52 PM, JC Beyler wrote: > Dear all, > > Small update to the webrev: > http://cr.openjdk.java.net/~rasbold/8171119/webrev.09_10/ > > Full webrev is here: > http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/ > > I updated a bit of the naming, removed a TODO comment, and I added a test > for testing the sampling rate. I also updated the maximum stack depth to > 1024, there is no reason to keep it so small. I did a micro benchmark that > tests the overhead and it seems relatively the same. > > I compared allocations from a stack depth of 10 and allocations from a > stack depth of 1024 (allocations are from the same helper method in > http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/ > raw_files/new/test/hotspot/jtreg/serviceability/jvmti/ > HeapMonitor/MyPackage/HeapMonitorStatRateTest.java): > - For an array of 1 integer allocated in a loop; stack depth > 1024 vs stack depth 10: 1% slower > - For an array of 200k integers allocated in a loop; stack depth > 1024 vs stack depth 10: 3% slower > > So basically now moving the maximum stack depth to 1024 but we only copy > over the stack depths actually used. > > For the next webrev, I will be adding a stack depth test to show that it > works and probably put back the mutex locking so that we can see how > difficult it is to keep thread safe. > > Let me know what you think! 
> Jc > > > > On Mon, Sep 25, 2017 at 3:02 PM, JC Beyler wrote: > >> Forgot to say that for my numbers: >> - Not in the test are the actual numbers I got for the various array >> sizes, I ran the program 30 times and parsed the output; here are the >> averages and standard deviation: >> 1000: 1.28% average; 1.13% standard deviation >> 10000: 1.59% average; 1.25% standard deviation >> 100000: 1.26% average; 1.26% standard deviation >> >> The 1000/10000/100000 are the sizes of the arrays being allocated. These >> are allocated 100k times and the sampling rate is 111 times the size of the >> array. >> >> Thanks! >> Jc >> >> >> On Mon, Sep 25, 2017 at 3:01 PM, JC Beyler wrote: >> >>> Hi all, >>> >>> After a bit of a break, I am back working on this :). As before, here >>> are two webrevs: >>> >>> - Full change set: http://cr.openjdk.java.net/~rasbold/8171119/webrev.09/ >>> - Compared to version 8: http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/ >>> (This version is compared to version 8 I last showed but ported to >>> the new folder hierarchy) >>> >>> In this version I have: >>> - Handled Thomas' comments from his email of 07/03: >>> - Merged the logging to be standard >>> - Fixed up the code a bit where asked >>> - Added some notes about the code not being thread-safe yet >>> - Removed additional dead code from the version that modifies >>> interpreter/c1/c2 >>> - Fixed compiler issues so that it compiles with >>> --disable-precompiled-header >>> - Tested with ./configure --with-boot-jdk= >>> --with-debug-level=slowdebug --disable-precompiled-headers >>> >>> Additionally, I added a test to check the sanity of the sampler: >>> HeapMonitorStatCorrectnessTest (http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatCorrectnessTest.java.patch) >>> - This allocates a number of arrays and checks that we obtain the >>> number of samples we want with an accepted error of
5%. I tested it 100 >>> times and it passed everytime, I can test more if wanted >>> - Not in the test are the actual numbers I got for the various array >>> sizes, I ran the program 30 times and parsed the output; here are the >>> averages and standard deviation: >>> 1000: 1.28% average; 1.13% standard deviation >>> 10000: 1.59% average; 1.25% standard deviation >>> 100000: 1.26% average; 1.26% standard deviation >>> >>> What this means is that we were always at about 1~2% of the number of >>> samples the test expected. >>> >>> Let me know what you think, >>> Jc >>> >>> >>> >>> On Wed, Jul 5, 2017 at 9:31 PM, JC Beyler wrote: >>> >>>> Hi all, >>>> >>>> I apologize, I have not yet handled your remarks but thought this new >>>> webrev would also be useful to see and comment on perhaps. >>>> >>>> Here is the latest webrev, it is generated slightly different than the >>>> others since now I'm using webrev.ksh without the -N option: >>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.08/ >>>> >>>> And the webrev.07 to webrev.08 diff is here: >>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.07_08/ >>>> >>>> (Let me know if it works well) >>>> >>>> It's a small change between versions but it: >>>> - provides a fix that makes the average sample rate correct (more on >>>> that below). >>>> - fixes the code to actually have it play nicely with the fast tlab >>>> refill >>>> - cleaned up a bit the JVMTI text and now use jvmtiFrameInfo >>>> - moved the capability to be onload solo >>>> >>>> With this webrev, I've done a small study of the random number >>>> generator we use here for the sampling rate. I took a small program and it >>>> can be simplified to: >>>> >>>> for (outer loop) >>>> for (inner loop) >>>> int[] tmp = new int[arraySize]; >>>> >>>> - I've fixed the outer and inner loops to being 800 for this >>>> experiment, meaning we allocate 640000 times an array of a given array >>>> size. 
>>>> >>>> - Each program provides the average sample size used for the whole >>>> execution >>>> >>>> - Then, I ran each variation 30 times and then calculated the average >>>> of the average sample size used for various array sizes. I selected the >>>> array size to be one of the following: 1, 10, 100, 1000, 10000. >>>> >>>> - When compared to 512kb, the average sample size of 30 runs: >>>> 1: 4.62% of error >>>> 10: 3.09% of error >>>> 100: 0.36% of error >>>> 1000: 0.1% of error >>>> 10000: 0.03% of error >>>> >>>> What it shows is that, depending on the number of samples, the average >>>> does become better. This is because with an allocation of 1 element per >>>> array, it will take longer to hit one of the thresholds. This is seen by >>>> looking at the sample count statistic I put in. For the same number of >>>> iterations (800 * 800), the different array sizes provoke: >>>> 1: 62 samples >>>> 10: 125 samples >>>> 100: 788 samples >>>> 1000: 6166 samples >>>> 10000: 57721 samples >>>> >>>> And of course, the more samples you have, the more sample rates you >>>> pick, which means that your average gets closer using that math. >>>> >>>> Thanks, >>>> Jc >>>> >>>> On Thu, Jun 29, 2017 at 10:01 PM, JC Beyler >>>> wrote: >>>> >>>>> Thanks Robbin, >>>>> >>>>> This seems to have worked. When I have the next webrev ready, we will >>>>> find out but I'm fairly confident it will work! >>>>> >>>>> Thanks again!
>>>>> Jc
>>>>>
>>>>> On Wed, Jun 28, 2017 at 11:46 PM, Robbin Ehn
>>>>> wrote:
>>>>>
>>>>>> Hi JC,
>>>>>>
>>>>>> On 06/29/2017 12:15 AM, JC Beyler wrote:
>>>>>>
>>>>>>> B) Incremental changes
>>>>>>>
>>>>>>
>>>>>> I guess the most common workflow here is using mq:
>>>>>> hg qnew fix_v1
>>>>>> edit files
>>>>>> hg qrefresh
>>>>>> hg qnew fix_v2
>>>>>> edit files
>>>>>> hg qrefresh
>>>>>>
>>>>>> if you do hg log you will see 2 commits
>>>>>>
>>>>>> webrev.ksh -r -2 -o my_inc_v1_v2
>>>>>> webrev.ksh -o my_full_v2
>>>>>>
>>>>>>
>>>>>> In your .hgrc you might need:
>>>>>> [extensions]
>>>>>> mq =
>>>>>>
>>>>>> /Robbin
>>>>>>
>>>>>>
>>>>>>> Again another newbie question here...
>>>>>>>
>>>>>>> For showing the incremental changes, is there a link that explains
>>>>>>> how to do that? I apologize for my newbie questions all the time :)
>>>>>>>
>>>>>>> Right now, I do:
>>>>>>>
>>>>>>>   ksh ../webrev.ksh -m -N
>>>>>>>
>>>>>>> That generates a webrev.zip, which I send to Chuck Rasbold. He then
>>>>>>> uploads it to a new webrev.
>>>>>>>
>>>>>>> I tried committing my change and adding a small change. Then if I
>>>>>>> just do ksh ../webrev.ksh without any options, it seems to produce a
>>>>>>> similar page but now with only the changes I had (so the 06-07 comparison
>>>>>>> you were talking about) and a changeset that has it all. I imagine that is
>>>>>>> what you meant.
>>>>>>>
>>>>>>> Which means that my workflow would become:
>>>>>>>
>>>>>>> 1) Make changes
>>>>>>> 2) Make a webrev without any options to show just the differences
>>>>>>> with the tip
>>>>>>> 3) Amend my changes to my local commit so that I have it done with
>>>>>>> 4) Go to 1
>>>>>>>
>>>>>>> Does that seem correct to you?
>>>>>>>
>>>>>>> Note that when I do this, I only see the full change of a file in
>>>>>>> the full change set (Side note here: now the page says change set and not
>>>>>>> patch, which is maybe why Serguei was having issues?).
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Jc
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 28, 2017 at 1:12 AM, Robbin Ehn wrote:
>>>>>>>
>>>>>>>     Hi,
>>>>>>>
>>>>>>>     On 06/28/2017 12:04 AM, JC Beyler wrote:
>>>>>>>
>>>>>>>         Dear Thomas et al,
>>>>>>>
>>>>>>>         Here is the newest webrev:
>>>>>>>         http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     You have some more bits to in there but generally this looks
>>>>>>>     good and really nice with more tests.
>>>>>>>     I'll do a deep dive and re-test this when I get back from my
>>>>>>>     long vacation with whatever patch version you have then.
>>>>>>>
>>>>>>>     Also I think it's time you provide incremental (v06->07 changes)
>>>>>>>     as well as complete change-sets.
>>>>>>>
>>>>>>>     Thanks, Robbin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>         Thomas, I "think" I have answered all your remarks. The
>>>>>>>         summary is:
>>>>>>>
>>>>>>>         - The statistic system is up and provides insight on what
>>>>>>>           the heap sampler is doing
>>>>>>>             - I've noticed that, though the sampling rate is at the
>>>>>>>               right mean, we are missing some samples; I have not yet
>>>>>>>               tracked down why (details below)
>>>>>>>
>>>>>>>         - I've run a tiny benchmark that is the worst case: it is a
>>>>>>>           very tight loop and allocates a small array
>>>>>>>             - In this case, I see no overhead when the system is
>>>>>>>               off so that is a good start :)
>>>>>>>             - I see right now a high overhead in this case when
>>>>>>>               sampling is on. This is not really too surprising but
>>>>>>>               I'm going to see if this is consistent with our
>>>>>>>               internal implementation. The benchmark is really
>>>>>>>               allocation stressful so I'm not too surprised but I
>>>>>>>               want to do the due diligence.
>>>>>>>
>>>>>>>         - The statistic system is up and I have a new test
>>>>>>>             - http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch
>>>>>>>             - I did a bit of a study about the random generator
>>>>>>>               here, more details are below but basically it seems to
>>>>>>>               work well
>>>>>>>
>>>>>>>         - I added a capability but since this is the first time
>>>>>>>           doing this, I was not sure I did it right
>>>>>>>             - I did add a test though for it and the test seems to
>>>>>>>               do what I expect (all methods are failing with the
>>>>>>>               JVMTI_ERROR_MUST_POSSESS_CAPABILITY error).
>>>>>>>             - http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch
>>>>>>>
>>>>>>>         - I still need to figure out what to do about the
>>>>>>>           multi-agent vs single-agent issue
>>>>>>>
>>>>>>>         - As far as measurements, it seems I still need to look
>>>>>>>           at:
>>>>>>>             - Why we do the 20 random calls first, are they
>>>>>>>               necessary?
>>>>>>>             - Look at the mean of the sampling rate that the random
>>>>>>>               generator does and also what is actually sampled
>>>>>>>             - What is the overhead in terms of memory/performance
>>>>>>>               when on?
>>>>>>>
>>>>>>>         I have inlined my answers, I think I got them all in the new
>>>>>>>         webrev, let me know your thoughts.
>>>>>>>
>>>>>>>         Thanks again!
>>>>>>> Jc >>>>>>> >>>>>>> >>>>>>> On Fri, Jun 23, 2017 at 3:52 AM, Thomas Schatzl < >>>>>>> thomas.schatzl at oracle.com >>>>>>> >>>>>> >>>>>>> >> wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> On Wed, 2017-06-21 at 13:45 -0700, JC Beyler wrote: >>>>>>> > Hi all, >>>>>>> > >>>>>>> > First off: Thanks again to Robbin and Thomas for >>>>>>> their reviews :) >>>>>>> > >>>>>>> > Next, I've uploaded a new webrev: >>>>>>> > http://cr.openjdk.java.net/~ra >>>>>>> sbold/8171119/webrev.06/ >>>>>> asbold/8171119/webrev.06/> >>>>>>> >>>>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/>> >>>>>>> >>>>>>> > >>>>>>> > Here is an update: >>>>>>> > >>>>>>> > - @Robbin, I forgot to say that yes I need to look at >>>>>>> implementing >>>>>>> > this for the other architectures and testing it >>>>>>> before it is all >>>>>>> > ready to go. Is it common to have it working on all >>>>>>> possible >>>>>>> > combinations or is there a subset that I should be >>>>>>> doing first and we >>>>>>> > can do the others later? >>>>>>> > - I've tested slowdebug, built and ran the JTreg >>>>>>> tests I wrote with >>>>>>> > slowdebug and fixed a few more issues >>>>>>> > - I've refactored a bit of the code following Thomas' >>>>>>> comments >>>>>>> > - I think I've handled all the comments from >>>>>>> Thomas (I put >>>>>>> > comments inline below for the specifics) >>>>>>> >>>>>>> Thanks for handling all those. >>>>>>> >>>>>>> > - Following Thomas' comments on statistics, I want to >>>>>>> add some >>>>>>> > quality assurance tests and find that the easiest way >>>>>>> would be to >>>>>>> > have a few counters of what is happening in the >>>>>>> sampler and expose >>>>>>> > that to the user. >>>>>>> > - I'll be adding that in the next version if no >>>>>>> one sees any >>>>>>> > objections to that. 
>>>>>>> > - This will allow me to add a sanity test in JTreg >>>>>>> about number of >>>>>>> > samples and average of sampling rate >>>>>>> > >>>>>>> > @Thomas: I had a few questions that I inlined below >>>>>>> but I will >>>>>>> > summarize the "bigger ones" here: >>>>>>> > - You mentioned constants are not using the right >>>>>>> conventions, I >>>>>>> > looked around and didn't see any convention except >>>>>>> normal naming then >>>>>>> > for static constants. Is that right? >>>>>>> >>>>>>> I looked through https://wiki.openjdk.java.net/ >>>>>>> display/HotSpot/StyleGui >>>>>> /display/HotSpot/StyleGui> >>>>>>> >>>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGui>> >>>>>>> de and the rule is to "follow an existing pattern and >>>>>>> must have a >>>>>>> distinct appearance from other names". Which does not >>>>>>> help a lot I >>>>>>> guess :/ The GC team started using upper camel case, >>>>>>> e.g. >>>>>>> SomeOtherConstant, but very likely this is probably not >>>>>>> applied >>>>>>> consistently throughout. So I am fine with not adding >>>>>>> another style >>>>>>> (like kMaxStackDepth with the "k" in front with some >>>>>>> unknown meaning) >>>>>>> is fine. >>>>>>> >>>>>>> (Chances are you will find that style somewhere used >>>>>>> anyway too, >>>>>>> apologies if so :/) >>>>>>> >>>>>>> >>>>>>> Thanks for that link, now I know where to look. I used the >>>>>>> upper camel case in my code as well then :) I should have gotten them all. 
>>>>>>> >>>>>>> >>>>>>> > PS: I've also inlined my answers to Thomas below: >>>>>>> > >>>>>>> > On Tue, Jun 13, 2017 at 8:03 AM, Thomas Schatzl >>>>>>> >>>>>> > e.com > wrote: >>>>>>> > > Hi all, >>>>>>> > > >>>>>>> > > On Mon, 2017-06-12 at 11:11 -0700, JC Beyler wrote: >>>>>>> > > > Dear all, >>>>>>> > > > >>>>>>> > > > I've continued working on this and have done the >>>>>>> following >>>>>>> > > webrev: >>>>>>> > > > http://cr.openjdk.java.net/~ra >>>>>>> sbold/8171119/webrev.05/ >>>>>> asbold/8171119/webrev.05/> >>>>>>> >>>>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/>> >>>>>>> >>>>>>> > > >>>>>>> > > [...] >>>>>>> > > > Things I still need to do: >>>>>>> > > > - Have to fix that TLAB case for the >>>>>>> FastTLABRefill >>>>>>> > > > - Have to start looking at the data to see >>>>>>> that it is >>>>>>> > > consistent and does gather the right samples, >>>>>>> right frequency, etc. >>>>>>> > > > - Have to check the GC elements and what that >>>>>>> produces >>>>>>> > > > - Run a slowdebug run and ensure I fixed all >>>>>>> those issues you >>>>>>> > > saw > Robbin >>>>>>> > > > >>>>>>> > > > Thanks for looking at the webrev and have a >>>>>>> great week! >>>>>>> > > >>>>>>> > > scratching a bit on the surface of this change, >>>>>>> so apologies for >>>>>>> > > rather shallow comments: >>>>>>> > > >>>>>>> > > - macroAssembler_x86.cpp:5604: while this is >>>>>>> compiler code, and I >>>>>>> > > am not sure this is final, please avoid littering >>>>>>> the code with >>>>>>> > > TODO remarks :) They tend to be candidates for >>>>>>> later wtf moments >>>>>>> > > only. >>>>>>> > > >>>>>>> > > Just file a CR for that. >>>>>>> > > >>>>>>> > Newcomer question: what is a CR and not sure I have >>>>>>> the rights to do >>>>>>> > that yet ? :) >>>>>>> >>>>>>> Apologies. CR is a change request, this suggests to >>>>>>> file a bug in the >>>>>>> bug tracker. And you are right, you can't just create a >>>>>>> new account in >>>>>>> the OpenJDK JIRA yourselves. 
:( >>>>>>> >>>>>>> >>>>>>> Ok good to know, I'll continue with my own todo list but >>>>>>> I'll work hard on not letting it slip in the webrevs anymore :) >>>>>>> >>>>>>> >>>>>>> I was mostly referring to the "... but it is a TODO" >>>>>>> part of that >>>>>>> comment in macroassembler_x86.cpp. Comments about the >>>>>>> why of the code >>>>>>> are appreciated. >>>>>>> >>>>>>> [Note that I now understand that this is to some degree >>>>>>> still work in >>>>>>> progress. As long as the final changeset does no >>>>>>> contain TODO's I am >>>>>>> fine (and it's not a hard objection, rather their use >>>>>>> in "final" code >>>>>>> is typically limited in my experience)] >>>>>>> >>>>>>> 5603 // Currently, if this happens, just set back the >>>>>>> actual end to >>>>>>> where it was. >>>>>>> 5604 // We miss a chance to sample here. >>>>>>> >>>>>>> Would be okay, if explaining "this" and the "why" of >>>>>>> missing a chance >>>>>>> to sample here would be best. >>>>>>> >>>>>>> Like maybe: >>>>>>> >>>>>>> // If we needed to refill TLABs, just set the actual >>>>>>> end point to >>>>>>> // the end of the TLAB again. We do not sample here >>>>>>> although we could. >>>>>>> >>>>>>> Done with your comment, it works well in my mind. >>>>>>> >>>>>>> I am not sure whether "miss a chance to sample" meant >>>>>>> "we could, but >>>>>>> consciously don't because it's not that useful" or "it >>>>>>> would be >>>>>>> necessary but don't because it's too complicated to >>>>>>> do.". >>>>>>> >>>>>>> Looking at the original comment once more, I am also >>>>>>> not sure if that >>>>>>> comment shouldn't referring to the "end" variable (not >>>>>>> actual_end) >>>>>>> because that's the variable that is responsible for >>>>>>> taking the sampling >>>>>>> path? (Going from the member description of >>>>>>> ThreadLocalAllocBuffer). >>>>>>> >>>>>>> >>>>>>> I've moved this code and it no longer shows up here but the >>>>>>> rationale and answer was: >>>>>>> >>>>>>> So.. 
Yes, end is the variable provoking the sampling. Actual >>>>>>> end is the actual end of the TLAB. >>>>>>> >>>>>>> What was happening here is that the code is resetting _end >>>>>>> to point towards the end of the new TLAB. Because, we now have the end for >>>>>>> sampling and _actual_end for >>>>>>> the actual end, we need to update the actual_end as well. >>>>>>> >>>>>>> Normally, were we to do the real work here, we would >>>>>>> calculate the (end - start) offset, then do: >>>>>>> >>>>>>> - Set the new end to : start + (old_end - old_start) >>>>>>> - Set the actual end like we do here now where it because it >>>>>>> is the actual end. >>>>>>> >>>>>>> Why is this not done here now anymore? >>>>>>> - I was still debating which path to take: >>>>>>> - Do it in the fast refill code, it has its perks: >>>>>>> - In a world where fast refills are happening all >>>>>>> the time or a lot, we can augment there the code to do the sampling >>>>>>> - Remember what we had as an end before leaving the >>>>>>> slowpath and check on return >>>>>>> - This is what I'm doing now, it removes the need >>>>>>> to go fix up all fast refill paths but if you remain in fast refill paths, >>>>>>> you won't get sampling. I >>>>>>> have to think of the consequences of that, maybe a future >>>>>>> change later on? >>>>>>> - I have the statistics now so I'm going to >>>>>>> study that >>>>>>> -> By the way, though my statistics are >>>>>>> showing I'm missing some samples, if I turn off FastTlabRefill, it is the >>>>>>> same loss so for now, it seems >>>>>>> this does not occur in my simple test. >>>>>>> >>>>>>> >>>>>>> >>>>>>> But maybe I am only confused and it's best to just >>>>>>> leave the comment >>>>>>> away. :) >>>>>>> >>>>>>> Thinking about it some more, doesn't this not-sampling >>>>>>> in this case >>>>>>> mean that sampling does not work in any collector that >>>>>>> does inline TLAB >>>>>>> allocation at the moment? 
(Or is inline TLAB alloc >>>>>>> automatically >>>>>>> disabled with sampling somehow?) >>>>>>> >>>>>>> That would indeed be a bigger TODO then :) >>>>>>> >>>>>>> >>>>>>> Agreed, this remark made me think that perhaps as a first >>>>>>> step the new way of doing it is better but I did have to: >>>>>>> - Remove the const of the ThreadLocalBuffer remaining and >>>>>>> hard_end methods >>>>>>> - Move hard_end out of the header file to have a bit more >>>>>>> logic there >>>>>>> >>>>>>> Please let me know what you think of that and if you prefer >>>>>>> it this way or changing the fast refills. (I prefer this way now because it >>>>>>> is more incremental). >>>>>>> >>>>>>> >>>>>>> > > - calling HeapMonitoring::do_weak_oops() (which >>>>>>> should probably be >>>>>>> > > called weak_oops_do() like other similar methods) >>>>>>> only if string >>>>>>> > > deduplication is enabled (in >>>>>>> g1CollectedHeap.cpp:4511) seems wrong. >>>>>>> > >>>>>>> > The call should be at least around 6 lines up outside >>>>>>> the if. >>>>>>> > >>>>>>> > Preferentially in a method like >>>>>>> process_weak_jni_handles(), including >>>>>>> > additional logging. (No new (G1) gc phase without >>>>>>> minimal logging >>>>>>> > :)). >>>>>>> > Done but really not sure because: >>>>>>> > >>>>>>> > I put for logging: >>>>>>> > log_develop_trace(gc, >>>>>>> freelist)("G1ConcRegionFreeing [other] : heap >>>>>>> > monitoring"); >>>>>>> >>>>>>> I would think that "gc, ref" would be more appropriate >>>>>>> log tags for >>>>>>> this similar to jni handles. >>>>>>> (I am als not sure what weak reference handling has to >>>>>>> do with >>>>>>> G1ConcRegionFreeing, so I am a bit puzzled) >>>>>>> >>>>>>> >>>>>>> I was not sure what to put for the tags or really as the >>>>>>> message. 
I cleaned it up a bit now to: >>>>>>> log_develop_trace(gc, ref)("HeapSampling [other] : heap >>>>>>> monitoring processing"); >>>>>>> >>>>>>> >>>>>>> >>>>>>> > Since weak_jni_handles didn't have logging for me to >>>>>>> be inspired >>>>>>> > from, I did that but unconvinced this is what should >>>>>>> be done. >>>>>>> >>>>>>> The JNI handle processing does have logging, but only in >>>>>>> ReferenceProcessor::process_discovered_references(). In >>>>>>> process_weak_jni_handles() only overall time is >>>>>>> measured (in a G1 >>>>>>> specific way, since only G1 supports disabling >>>>>>> reference procesing) :/ >>>>>>> >>>>>>> The code in ReferenceProcessor prints both time taken >>>>>>> referenceProcessor.cpp:254, as well as the count, but >>>>>>> strangely only in >>>>>>> debug VMs. >>>>>>> >>>>>>> I have no idea why this logging is that unimportant to >>>>>>> only print that >>>>>>> in a debug VM. However there are reviews out for >>>>>>> changing this area a >>>>>>> bit, so it might be useful to wait for that >>>>>>> (JDK-8173335). >>>>>>> >>>>>>> >>>>>>> I cleaned it up a bit anyway and now it returns the count of >>>>>>> objects that are in the system. >>>>>>> >>>>>>> >>>>>>> > > - the change doubles the size of >>>>>>> > > CollectedHeap::allocate_from_tlab_slow() above the >>>>>>> "small and nice" >>>>>>> > > threshold. Maybe it could be refactored a bit. >>>>>>> > Done I think, it looks better to me :). >>>>>>> >>>>>>> In ThreadLocalAllocBuffer::handle_sample() I think the >>>>>>> set_back_actual_end()/pick_next_sample() calls could >>>>>>> be hoisted out of >>>>>>> the "if" :) >>>>>>> >>>>>>> >>>>>>> Done! >>>>>>> >>>>>>> >>>>>>> > > - referenceProcessor.cpp:261: the change should add >>>>>>> logging about >>>>>>> > > the number of references encountered, maybe after >>>>>>> the corresponding >>>>>>> > > "JNI weak reference count" log message. 
>>>>>>> > Just to double check, are you saying that you'd like >>>>>>> to have the heap >>>>>>> > sampler to keep in store how many sampled objects >>>>>>> were encountered in >>>>>>> > the HeapMonitoring::weak_oops_do? >>>>>>> > - Would a return of the method with the number of >>>>>>> handled >>>>>>> > references and logging that work? >>>>>>> >>>>>>> Yes, it's fine if HeapMonitoring::weak_oops_do() only >>>>>>> returned the >>>>>>> number of processed weak oops. >>>>>>> >>>>>>> >>>>>>> Done also (but I admit I have not tested the output yet) :) >>>>>>> >>>>>>> >>>>>>> > - Additionally, would you prefer it in a separate >>>>>>> block with its >>>>>>> > GCTraceTime? >>>>>>> >>>>>>> Yes. Both kinds of information is interesting: while >>>>>>> the time taken is >>>>>>> typically more important, the next question would be >>>>>>> why, and the >>>>>>> number of references typically goes a long way there. >>>>>>> >>>>>>> See above though, it is probably best to wait a bit. >>>>>>> >>>>>>> >>>>>>> Agreed that I "could" wait but, if it's ok, I'll just >>>>>>> refactor/remove this when we get closer to something final. Either, >>>>>>> JDK-8173335 >>>>>>> has gone in and I will notice it now or it will soon and I >>>>>>> can change it then. >>>>>>> >>>>>>> >>>>>>> > > - threadLocalAllocBuffer.cpp:331: one more "TODO" >>>>>>> > Removed it and added it to my personal todos to look >>>>>>> at. >>>>>>> > > > >>>>>>> > > - threadLocalAllocBuffer.hpp: >>>>>>> ThreadLocalAllocBuffer class >>>>>>> > > documentation should be updated about the sampling >>>>>>> additions. I >>>>>>> > > would have no clue what the difference between >>>>>>> "actual_end" and >>>>>>> > > "end" would be from the given information. >>>>>>> > If you are talking about the comments in this file, I >>>>>>> made them more >>>>>>> > clear I hope in the new webrev. If it was somewhere >>>>>>> else, let me know >>>>>>> > where to change. >>>>>>> >>>>>>> Thanks, that's much better. 
Maybe a note in the comment >>>>>>> of the class >>>>>>> that ThreadLocalBuffer provides some sampling facility >>>>>>> by modifying the >>>>>>> end() of the TLAB to cause "frequent" calls into the >>>>>>> runtime call where >>>>>>> actual sampling takes place. >>>>>>> >>>>>>> >>>>>>> Done, I think it's better now. Added something about the >>>>>>> slow_path_end as well. >>>>>>> >>>>>>> >>>>>>> > > - in heapMonitoring.hpp: there are some random >>>>>>> comments about some >>>>>>> > > code that has been grabbed from >>>>>>> "util/math/fastmath.[h|cc]". I >>>>>>> > > can't tell whether this is code that can be used >>>>>>> but I assume that >>>>>>> > > Noam Shazeer is okay with that (i.e. that's all >>>>>>> Google code). >>>>>>> > Jeremy and I double checked and we can release that >>>>>>> as I thought. I >>>>>>> > removed the comment from that piece of code entirely. >>>>>>> >>>>>>> Thanks. >>>>>>> >>>>>>> > > - heapMonitoring.hpp/cpp static constant naming >>>>>>> does not correspond >>>>>>> > > to Hotspot's. Additionally, in Hotspot static >>>>>>> methods are cased >>>>>>> > > like other methods. >>>>>>> > I think I fixed the methods to be cased the same way >>>>>>> as all other >>>>>>> > methods. For static constants, I was not sure. I >>>>>>> fixed a few other >>>>>>> > variables but I could not seem to really see a >>>>>>> consistent trend for >>>>>>> > constants. I made them as variables but I'm not sure >>>>>>> now. >>>>>>> >>>>>>> Sorry again, style is a kind of mess. The goal of my >>>>>>> suggestions here >>>>>>> is only to prevent yet another style creeping in. >>>>>>> >>>>>>> > > - in heapMonitoring.cpp there are a few cryptic >>>>>>> comments at the top >>>>>>> > > that seem to refer to internal stuff that should >>>>>>> probably be >>>>>>> > > removed. >>>>>>> > Sorry about that! My personal todos not cleared out. 
>>>>>>> >>>>>>> I am happy about comments, but I simply did not >>>>>>> understand any of that >>>>>>> and I do not know about other readers as well. >>>>>>> >>>>>>> If you think you will remember removing/updating them >>>>>>> until the review >>>>>>> proper (I misunderstood the review situation a little >>>>>>> it seems). >>>>>>> >>>>>>> > > I did not think through the impact of the TLAB >>>>>>> changes on collector >>>>>>> > > behavior yet (if there are). Also I did not check >>>>>>> for problems with >>>>>>> > > concurrent mark and SATB/G1 (if there are). >>>>>>> > I would love to know your thoughts on this, I think >>>>>>> this is fine. I >>>>>>> >>>>>>> I think so too now. No objects are made live out of >>>>>>> thin air :) >>>>>>> >>>>>>> > see issues with multiple threads right now hitting >>>>>>> the stack storage >>>>>>> > instance. Previous webrevs had a mutex lock here but >>>>>>> we took it out >>>>>>> > for simplificity (and only for now). >>>>>>> >>>>>>> :) When looking at this after some thinking I now >>>>>>> assume for this >>>>>>> review that this code is not MT safe at all. There >>>>>>> seems to be more >>>>>>> synchronization missing than just the one for the >>>>>>> StackTraceStorage. So >>>>>>> no comments about this here. >>>>>>> >>>>>>> >>>>>>> I doubled checked a bit (quickly I admit) but it seems that >>>>>>> synchronization in StackTraceStorage is really all you need (all methods >>>>>>> lead to a StackTraceStorage one >>>>>>> and can be multithreaded outside of that). >>>>>>> There is a question about the initialization where the >>>>>>> method HeapMonitoring::initialize_profiling is not thread safe. >>>>>>> It would work (famous last words) and not crash if there was >>>>>>> a race but we could add a synchronization point there as well (and >>>>>>> therefore on the stop as well). >>>>>>> >>>>>>> But anyway I will really check and do this once we add back >>>>>>> synchronization. 
>>>>>>> >>>>>>> >>>>>>> Also, this would require some kind of specification of >>>>>>> what is allowed >>>>>>> to be called when and where. >>>>>>> >>>>>>> >>>>>>> Would we specify this with the methods in the jvmti.xml >>>>>>> file? We could start by specifying in each that they are not thread safe >>>>>>> but I saw no mention of that for >>>>>>> other methods. >>>>>>> >>>>>>> >>>>>>> One potentially relevant observation about locking >>>>>>> here: depending on >>>>>>> sampling frequency, StackTraceStore::add_trace() may be >>>>>>> rather >>>>>>> frequently called. I assume that you are going to do >>>>>>> measurements :) >>>>>>> >>>>>>> >>>>>>> Though we don't have the TLAB implementation in our code, >>>>>>> the compiler generated sampler uses 2% of overhead with a 512k sampling >>>>>>> rate. I can do real measurements >>>>>>> when the code settles and we can see how costly this is as a >>>>>>> TLAB implementation. >>>>>>> However, my theory is that if the rate is 512k, the >>>>>>> memory/performance overhead should be minimal since it is what we saw with >>>>>>> our code/workloads (though not called >>>>>>> the same way, we call it essentially at the same rate). >>>>>>> If you have a benchmark you'd like me to test, let me know! >>>>>>> >>>>>>> Right now, with my really small test, this does use a bit of >>>>>>> overhead even for a 512k sample size. I don't know yet why, I'm going to >>>>>>> see what is going on. >>>>>>> >>>>>>> Finally, I think it is not reasonable to suppose the >>>>>>> overhead to be negligible if the sampling rate used is too low. The user >>>>>>> should know that the lower the rate, >>>>>>> the higher the overhead (documentation TODO?). >>>>>>> >>>>>>> >>>>>>> I am not sure what the expected usage of the API is, but >>>>>>> StackTraceStore::add_trace() seems to be able to grow >>>>>>> without bounds. >>>>>>> Only a GC truncates them to the live ones. 
That in >>>>>>> itself seems to be >>>>>>> problematic (GCs can be *wide* apart), and of course >>>>>>> some of the API >>>>>>> methods add to that because they duplicate that >>>>>>> unbounded array. Do you >>>>>>> have any concerns/measurements about this? >>>>>>> >>>>>>> >>>>>>> So, the theory is that yes add_trace can be able to grow >>>>>>> without bounds but it grows at a sample per 512k of allocated space. The >>>>>>> stacks it gathers are currently >>>>>>> maxed at 64 (I'd like to expand that to an option to the >>>>>>> user though at some point). So I have no concerns because: >>>>>>> >>>>>>> - If really this is taking a lot of space, that means the >>>>>>> job is keeping a lot of objects in memory as well, therefore the entire >>>>>>> heap is getting huge >>>>>>> - If this is the case, you will be triggering a GC at some >>>>>>> point anyway. >>>>>>> >>>>>>> (I'm putting under the rug the issue of "What if we set the >>>>>>> rate to 1 for example" because as you lower the sampling rate, we cannot >>>>>>> guarantee low overhead; the >>>>>>> idea behind this feature is to have a means of having >>>>>>> meaningful allocated samples at a low overhead) >>>>>>> >>>>>>> I have no measurements really right now but since I now have >>>>>>> some statistics I can poll, I will look a bit more at this question. >>>>>>> >>>>>>> I have the same last sentence than above: the user should >>>>>>> expect this to happen if the sampling rate is too small. That probably can >>>>>>> be reflected in the >>>>>>> StartHeapSampling as a note : careful this might impact your >>>>>>> performance. >>>>>>> >>>>>>> >>>>>>> Also, these stack traces might hold on to huge arrays. >>>>>>> Any >>>>>>> consideration of that? Particularly it might be the >>>>>>> cause for OOMEs in >>>>>>> tight memory situations. >>>>>>> >>>>>>> >>>>>>> There is a stack size maximum that is set to 64 so it should >>>>>>> not hold huge arrays. 
I don't think this is an issue but I can double check
>>>>>>> with a test or two.
>>>>>>>
>>>>>>>
>>>>>>>     - please consider adding a safepoint check in
>>>>>>>     HeapMonitoring::weak_oops_do to prevent accidental misuse.
>>>>>>>
>>>>>>>     - in struct StackTraceStorage, the public fields may also need
>>>>>>>     underscores. At least some files in the runtime directory have
>>>>>>>     structs with underscored public members (and some don't). The
>>>>>>>     runtime team should probably comment on that.
>>>>>>>
>>>>>>>
>>>>>>> Agreed, I did not know. I looked around and a lot of structs
>>>>>>> did not have them it seemed so I left it as is. I will happily change it if
>>>>>>> someone prefers (I was not
>>>>>>> sure if you really preferred or not, your sentence seemed to
>>>>>>> be more a note of "this might need to change but I don't know if the
>>>>>>> runtime team enforces that", let
>>>>>>> me know if I read that wrongly).
>>>>>>>
>>>>>>>
>>>>>>>     - In StackTraceStorage::weak_oops_do(), when examining the
>>>>>>>     StackTraceData, maybe it is useful to consider having a non-NULL
>>>>>>>     reference outside of the heap's reserved space an error. There
>>>>>>>     should be no oop outside of the heap's reserved space ever.
>>>>>>>
>>>>>>>     Unless you allow storing random values in StackTraceData::obj,
>>>>>>>     which I would not encourage.
>>>>>>>
>>>>>>>
>>>>>>> I suppose you are talking about this part:
>>>>>>> if ((value != NULL && Universe::heap()->is_in_reserved(value)) &&
>>>>>>>     (is_alive == NULL || is_alive->do_object_b(value))) {
>>>>>>>
>>>>>>> What you are saying is that I could have something like:
>>>>>>> if (value != my_non_null_reference &&
>>>>>>>     (is_alive == NULL || is_alive->do_object_b(value))) {
>>>>>>>
>>>>>>> Is that what you meant? Is there really a reason to do so?
>>>>>>> When I look at the code, is_in_reserved seems like an O(1) method call.
I'm
>>>>>>> not even sure we can have a
>>>>>>> NULL value to be honest. I might have to study that to see
>>>>>>> if this was not a paranoid test to begin with.
>>>>>>>
>>>>>>> The is_alive code has now morphed due to the comment below.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     - HeapMonitoring::weak_oops_do() does not seem to use the
>>>>>>>     passed AbstractRefProcTaskExecutor.
>>>>>>>
>>>>>>>
>>>>>>> It did use it:
>>>>>>> size_t HeapMonitoring::weak_oops_do(
>>>>>>>     AbstractRefProcTaskExecutor *task_executor,
>>>>>>>     BoolObjectClosure* is_alive,
>>>>>>>     OopClosure *f,
>>>>>>>     VoidClosure *complete_gc) {
>>>>>>>   assert(SafepointSynchronize::is_at_safepoint(), "must be at safepoint");
>>>>>>>
>>>>>>>   if (task_executor != NULL) {
>>>>>>>     task_executor->set_single_threaded_mode();
>>>>>>>   }
>>>>>>>   return StackTraceStorage::storage()->weak_oops_do(is_alive, f, complete_gc);
>>>>>>> }
>>>>>>>
>>>>>>> But due to the comment below, I refactored this, so this is
>>>>>>> no longer here. Now I have an always true closure that is passed.
>>>>>>>
>>>>>>>
>>>>>>>     - I do not understand allowing to call this method with a NULL
>>>>>>>     complete_gc closure. This would mean that objects referenced from
>>>>>>>     the object that is referenced by the StackTraceData are not
>>>>>>>     pulled, meaning they would get stale.
>>>>>>>
>>>>>>>     - same with is_alive parameter value of NULL
>>>>>>>
>>>>>>>
>>>>>>> So these questions made me look a bit closer at this code.
>>>>>>> This code I think was written this way to have a very small impact on the
>>>>>>> file but you are right, there
>>>>>>> is no reason for this here. I've simplified the code by
>>>>>>> adding a process_HeapSampling method in referenceProcessor.cpp that handles
>>>>>>> everything there.
>>>>>>>
>>>>>>> The code allowed NULLs because it depended on where you were
>>>>>>> coming from and how the code was being called.
>>>>>>>
>>>>>>> - I added a static always_true variable and pass that now to
>>>>>>> be more consistent with the rest of the code.
>>>>>>> - I moved the complete_gc into process_phaseHeapSampling now
>>>>>>> (new method) and handle the task_executor and the complete_gc there
>>>>>>>    - Newbie question: in our code we did a
>>>>>>> set_single_threaded_mode but I see that process_phaseJNI does it right
>>>>>>> before its call, do I need to do it for the
>>>>>>> process_phaseHeapSample?
>>>>>>> That API is much cleaner (in my mind) and is consistent with
>>>>>>> what is done around it (again in my mind).
>>>>>>>
>>>>>>>
>>>>>>>     - heapMonitoring.cpp:590: I do not completely understand the
>>>>>>>     purpose of this code: in the end this results in a fixed value
>>>>>>>     directly dependent on the Thread address anyway?
>>>>>>>     IOW, what is special about exactly 20 rounds?
>>>>>>>
>>>>>>>
>>>>>>> So we really want a fast random number generator that has a
>>>>>>> specific mean (512k is the default we use). The code uses the thread
>>>>>>> address as the start number of the
>>>>>>> sequence (the rationale being that it is random enough). Then
>>>>>>> instead of just starting there, we prime the sequence and really only start
>>>>>>> at the 21st number. The count is
>>>>>>> arbitrary and I have not done a study to see if we could do
>>>>>>> more or less of that.
>>>>>>>
>>>>>>> As I have the statistics of the system up and running, I'll
>>>>>>> run some experiments to see if this is needed, is 20 good, or not.
>>>>>>>
>>>>>>>
>>>>>>>     - also I would consider stripping a few bits of the threads'
>>>>>>>     address as initialization value for your rng. The last three bits
>>>>>>>     (and probably more, check whether the Thread object is allocated
>>>>>>>     on special boundaries) are always zero for them.
>>>>>>> Not sure if the given "random" value is random enough >>>>>>> before/after, >>>>>>> this method, so just skip that comment if you think >>>>>>> this is not >>>>>>> required. >>>>>>> >>>>>>> >>>>>>> I don't know is the honest answer. I think what is important >>>>>>> is that we tend towards a mean and it is random "enough" to not fall in >>>>>>> pitfalls of only sampling a >>>>>>> subset of objects due to their allocation order. I added >>>>>>> that as test to do to see if it changes the mean in any way for the 512k >>>>>>> default value and/or if the first >>>>>>> 1000 elements look better. >>>>>>> >>>>>>> >>>>>>> Some more random nits I did not find a place to put >>>>>>> anywhere: >>>>>>> >>>>>>> - ThreadLocalAllocBuffer::_extra_space does not seem >>>>>>> to be used >>>>>>> anywhere? >>>>>>> >>>>>>> >>>>>>> Good catch :). >>>>>>> >>>>>>> >>>>>>> - Maybe indent the declaration of >>>>>>> ThreadLocalAllocBuffer::_bytes_until_sample to align below the >>>>>>> other members of that group. >>>>>>> >>>>>>> >>>>>>> Done moved it up a bit to have non static members together >>>>>>> and static separate. >>>>>>> >>>>>>> Thanks, >>>>>>> Thomas >>>>>>> >>>>>>> >>>>>>> Thanks for your review! >>>>>>> Jc >>>>>>> >>>>>>> >>>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yasuenag at gmail.com Tue Oct 10 02:03:24 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 10 Oct 2017 11:03:24 +0900 Subject: RFR: 8189069: regression after push of 8187403: "AssertionFailure: addr should be OopHandle" Message-ID: Hi all, The following serviceability/sa tests are failed after 8187403: serviceability/sa/TestHeapDumpForInvokeDynamic.java serviceability/sa/TestHeapDumpForLargeArray.java serviceability/sa/jmap-hprof/JMapHProfLargeHeapTest.java These failures are caused by the address of HeapRegion. The address which is passed to c'tor of HeapRegion might not be OopHandle. 
So we have to switch the method of address calculation. I uploaded webrev for this issue. Could you review it? http://cr.openjdk.java.net/~ysuenaga/JDK-8189069/webrev.00/ I cannot access JPRT. So I need a sponsor. Thanks, Yasumasa From serguei.spitsyn at oracle.com Tue Oct 10 02:10:38 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 9 Oct 2017 19:10:38 -0700 Subject: RFR: 8189069: regression after push of 8187403: "AssertionFailure: addr should be OopHandle" In-Reply-To: References: Message-ID: <49e1a6b6-bc89-d158-cbe1-3e163edf4b09@oracle.com> Hi Yasumasa, Thank you for the quick fix! It looks good. I'll sponsor your fix after we get at least one more review. Thanks, Serguei On 10/9/17 19:03, Yasumasa Suenaga wrote: > Hi all, > > The following serviceability/sa tests are failed after 8187403: > > serviceability/sa/TestHeapDumpForInvokeDynamic.java > serviceability/sa/TestHeapDumpForLargeArray.java > serviceability/sa/jmap-hprof/JMapHProfLargeHeapTest.java > > These failures are caused by the address of HeapRegion. > The address which is passed to c'tor of HeapRegion might not be OopHandle. > So we have to switch the method of address calculation. > > I uploaded webrev for this issue. Could you review it? > > http://cr.openjdk.java.net/~ysuenaga/JDK-8189069/webrev.00/ > > > I cannot access JPRT. So I need a sponsor. > > > Thanks, > > Yasumasa From david.holmes at oracle.com Tue Oct 10 03:19:43 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Oct 2017 13:19:43 +1000 Subject: RFR: 8189069: regression after push of 8187403: "AssertionFailure: addr should be OopHandle" In-Reply-To: References: Message-ID: <5898f5da-30d2-a943-84df-c0bb7702b133@oracle.com> Hi Yasumasa, On 10/10/2017 12:03 PM, Yasumasa Suenaga wrote: > Hi all, > > The following serviceability/sa tests are failed after 8187403: Please ensure you add a link to the bug that introduces a failure when creating the new bug - I've added it now. 
> serviceability/sa/TestHeapDumpForInvokeDynamic.java > serviceability/sa/TestHeapDumpForLargeArray.java > serviceability/sa/jmap-hprof/JMapHProfLargeHeapTest.java Why was this not detected before 8187403 was pushed? > These failures are caused by the address of HeapRegion. > The address which is passed to c'tor of HeapRegion might not be OopHandle. > So we have to switch the method of address calculation. > > I uploaded webrev for this issue. Could you review it? > > http://cr.openjdk.java.net/~ysuenaga/JDK-8189069/webrev.00/ The approach seems reasonable though I'm somewhat unclear on what the possibilities for the address are. Thanks, David ----- > > I cannot access JPRT. So I need a sponsor. > > > Thanks, > > Yasumasa > From david.holmes at oracle.com Tue Oct 10 03:31:11 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Oct 2017 13:31:11 +1000 Subject: RFR: SA: MacOS X: 8184042: several serviceability/sa tests timed out on MacOS X In-Reply-To: <6b4cb988-a4e4-de22-1728-c87dccfde972@oracle.com> References: <2121618c-2878-385e-72bb-d9dc6c25b65a@oracle.com> <7b5ba4a7-2463-d9e3-b0a4-76957c630719@oracle.com> <081d276d-4616-364c-2f9c-acda35b4eaf4@oracle.com> <6b4cb988-a4e4-de22-1728-c87dccfde972@oracle.com> Message-ID: Thanks for your patience on this one Jini! The change looks good. Thanks, David On 10/10/2017 2:15 AM, Jini George wrote: > Hi all, > > I have created a webrev restoring the PT_ATTACH: > > http://cr.openjdk.java.net/~jgeorge/8184042/webrev.01/ > > Have included Dmitry's comments on disabling the the deprecation > warning. I would like to request for reviews for this. > > Thank you, > Jini. 
> > > On 9/8/2017 3:09 AM, serguei.spitsyn at oracle.com wrote: >> On 8/25/17 02:24, serguei.spitsyn at oracle.com wrote: >>> Hi Jini, >>> >>> >>> On 8/18/17 04:00, David Holmes wrote: >>>> Hi Jini, >>>> >>>> Just reading the bug report and your description below this seems >>>> like a major change to try and use a facility (mach exceptions) that >>>> no one seems to have any experience with! That isn't something to be >>>> rushed. >>> >>>> Even if PT_ATTACH has been deprecated restoring its use may be the >>>> quick way forward instead of trying to rush in something like this. >>> >>> This approach looks reasonable to me. >> >> I've just realized that my statement might sound incorrectly. >> I meant that the David's suggestion to restore the use of the >> deprecated PT_ATTACH looks reasonable. >> >> Sorry, if it caused any confusion. >> >> Thanks, >> Serguei >> >> >>> Otherwise, it would be nice to hear why it is not good. >>> How much would it break the fix of the JDK-8182299? >>> >>> Thanks, >>> Serguei >>> >>>> >>>> Just my 2c. >>>> >>>> Cheers, >>>> David >>>> >>>> On 18/08/2017 8:00 PM, Jini George wrote: >>>>> Hi all, >>>>> >>>>> Requesting reviews for: >>>>> https://bugs.openjdk.java.net/browse/JDK-8184042 >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~jgeorge/8184042/webrev.00/ >>>>> >>>>> Problem gist: The deprecated ptrace() command, PT_ATTACH was >>>>> changed to PT_ATTACHEXC, which causes mach exceptions (and not UNIX >>>>> signals) to be delivered via mach messages.This caused SA to hang >>>>> at waitpid() waiting for a signal, which does not arrive. >>>>> >>>>> Solution in a nutshell: The solution is to make the required >>>>> changes to handle mach 'soft signal' exceptions in the form of mach >>>>> messages instead of signals, while attaching to and detaching from >>>>> the target process. The detailed steps are outlined in JBS. 
>>>>> >>>>> The changes appear huge due to the inclusion of pre-generated mach >>>>> exception handling files (mach_exc*). Since this is an integration >>>>> blocker, it would be great to get quick reviews on this. >>>>> >>>>> Thank you, >>>>> Jini. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>> >> From serguei.spitsyn at oracle.com Tue Oct 10 04:18:54 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 9 Oct 2017 21:18:54 -0700 Subject: RFR: SA: MacOS X: 8184042: several serviceability/sa tests timed out on MacOS X In-Reply-To: References: <2121618c-2878-385e-72bb-d9dc6c25b65a@oracle.com> <7b5ba4a7-2463-d9e3-b0a4-76957c630719@oracle.com> <081d276d-4616-364c-2f9c-acda35b4eaf4@oracle.com> <6b4cb988-a4e4-de22-1728-c87dccfde972@oracle.com> Message-ID: Hi Jini, +1 Thanks, Serguei On 10/9/17 20:31, David Holmes wrote: > Thanks for your patience on this one Jini! The change looks good. > > Thanks, > David > > On 10/10/2017 2:15 AM, Jini George wrote: >> Hi all, >> >> I have created a webrev restoring the PT_ATTACH: >> >> http://cr.openjdk.java.net/~jgeorge/8184042/webrev.01/ >> >> Have included Dmitry's comments on disabling the the deprecation >> warning. I would like to request for reviews for this. >> >> Thank you, >> Jini. >> >> >> On 9/8/2017 3:09 AM, serguei.spitsyn at oracle.com wrote: >>> On 8/25/17 02:24, serguei.spitsyn at oracle.com wrote: >>>> Hi Jini, >>>> >>>> >>>> On 8/18/17 04:00, David Holmes wrote: >>>>> Hi Jini, >>>>> >>>>> Just reading the bug report and your description below this seems >>>>> like a major change to try and use a facility (mach exceptions) >>>>> that no one seems to have any experience with! That isn't >>>>> something to be rushed. >>>> >>>>> Even if PT_ATTACH has been deprecated restoring its use may be the >>>>> quick way forward instead of trying to rush in something like this. >>>> >>>> This approach looks reasonable to me. >>> >>> I've just realized that my statement might sound incorrectly. 
>>> I meant that the David's suggestion to restore the use of the >>> deprecated PT_ATTACH looks reasonable. >>> >>> Sorry, if it caused any confusion. >>> >>> Thanks, >>> Serguei >>> >>> >>>> Otherwise, it would be nice to hear why it is not good. >>>> How much would it break the fix of the JDK-8182299? >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> Just my 2c. >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>> On 18/08/2017 8:00 PM, Jini George wrote: >>>>>> Hi all, >>>>>> >>>>>> Requesting reviews for: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8184042 >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~jgeorge/8184042/webrev.00/ >>>>>> >>>>>> Problem gist: The deprecated ptrace() command, PT_ATTACH was >>>>>> changed to PT_ATTACHEXC, which causes mach exceptions (and not >>>>>> UNIX signals) to be delivered via mach messages.This caused SA to >>>>>> hang at waitpid() waiting for a signal, which does not arrive. >>>>>> >>>>>> Solution in a nutshell: The solution is to make the required >>>>>> changes to handle mach 'soft signal' exceptions in the form of >>>>>> mach messages instead of signals, while attaching to and >>>>>> detaching from the target process. The detailed steps are >>>>>> outlined in JBS. >>>>>> >>>>>> The changes appear huge due to the inclusion of pre-generated >>>>>> mach exception handling files (mach_exc*). Since this is an >>>>>> integration blocker, it would be great to get quick reviews on this. >>>>>> >>>>>> Thank you, >>>>>> Jini. 
>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>> >>> From serguei.spitsyn at oracle.com Tue Oct 10 04:31:13 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 9 Oct 2017 21:31:13 -0700 Subject: RFR: 8189069: regression after push of 8187403: "AssertionFailure: addr should be OopHandle" In-Reply-To: <5898f5da-30d2-a943-84df-c0bb7702b133@oracle.com> References: <5898f5da-30d2-a943-84df-c0bb7702b133@oracle.com> Message-ID: <03b648fd-78a5-b9d9-7b81-9328aaae3476@oracle.com> Hi David, On 10/9/17 20:19, David Holmes wrote: > Hi Yasumasa, > > On 10/10/2017 12:03 PM, Yasumasa Suenaga wrote: >> Hi all, >> >> The following serviceability/sa tests are failed after 8187403: > > Please ensure you add a link to the bug that introduces a failure when > creating the new bug - I've added it now. Thanks, David! I've also added a link to the original mdash failures. >> serviceability/sa/TestHeapDumpForInvokeDynamic.java >> serviceability/sa/TestHeapDumpForLargeArray.java >> serviceability/sa/jmap-hprof/JMapHProfLargeHeapTest.java > > Why was this not detected before 8187403 was pushed? It was a miscommunication between me and Yasumasa. I expected him to run all the SA tests before requesting a push. Thanks, Serguei > >> These failures are caused by the address of HeapRegion. >> The address which is passed to c'tor of HeapRegion might not be >> OopHandle. >> So we have to switch the method of address calculation. >> >> I uploaded webrev for this issue. Could you review it? >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8189069/webrev.00/ > > The approach seems reasonable though I'm somewhat unclear on what the > possibilities for the address are. > > Thanks, > David > ----- > >> >> I cannot access JPRT. So I need a sponsor.
>> >> >> Thanks, >> >> Yasumasa >> From yasuenag at gmail.com Tue Oct 10 04:44:53 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 10 Oct 2017 13:44:53 +0900 Subject: RFR: 8189069: regression after push of 8187403: "AssertionFailure: addr should be OopHandle" In-Reply-To: <03b648fd-78a5-b9d9-7b81-9328aaae3476@oracle.com> References: <5898f5da-30d2-a943-84df-c0bb7702b133@oracle.com> <03b648fd-78a5-b9d9-7b81-9328aaae3476@oracle.com> Message-ID: Hi David, Serguei, > It was a miscommunication between me and Yasumasa. > I expected him to run all the SA tests before requesting a push. Sorry, I did not run jtreg about this. I expected JPRT runs all tests before pushing. >> The approach seems reasonable though I'm somewhat unclear on what the >> possibilities for the address are. Exception stack shows that it was occurred in heap region iteration. So I guess it might not be OopHandle when the region is not used. Thanks, Yasumasa 2017-10-10 13:31 GMT+09:00 serguei.spitsyn at oracle.com : > Hi David, > > > On 10/9/17 20:19, David Holmes wrote: >> >> Hi Yasumasa, >> >> On 10/10/2017 12:03 PM, Yasumasa Suenaga wrote: >>> >>> Hi all, >>> >>> The following serviceability/sa tests are failed after 8187403: >> >> >> Please ensure you add a link to the bug that introduces a failure when >> creating the new bug - I've added it now. > > > Thanks, David! > I've also added a link to the original mdash failures. > >>> serviceability/sa/TestHeapDumpForInvokeDynamic.java >>> serviceability/sa/TestHeapDumpForLargeArray.java >>> serviceability/sa/jmap-hprof/JMapHProfLargeHeapTest.java >> >> >> Why was this not detected before 8187403 was pushed? > > > It was a miscommunication between me and Yasumasa. > I expected him to run all the SA tests before requesting a push. > > Thanks, > Serguei > > >> >>> These failures are caused by the address of HeapRegion. >>> The address which is passed to c'tor of HeapRegion might not be >>> OopHandle. 
>>> So we have to switch the method of address calculation. >>> >>> I uploaded webrev for this issue. Could you review it? >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8189069/webrev.00/ >> >> >> The approach seems reasonable though I'm somewhat unclear on what the >> possibilities for the address are. >> >> Thanks, >> David >> ----- >> >>> >>> I cannot access JPRT. So I need a sponsor. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> > From ujwal.vangapally at oracle.com Tue Oct 10 11:47:30 2017 From: ujwal.vangapally at oracle.com (Ujwal Vangapally) Date: Tue, 10 Oct 2017 17:17:30 +0530 Subject: RFR : JDK-8044122 MBean access to the PID Message-ID: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> Kindly review the changes made. https://bugs.openjdk.java.net/browse/JDK-8044122 webrev : http://cr.openjdk.java.net/~uvangapally/webrev/2017/8044122/webrev.00/ CSR : https://bugs.openjdk.java.net/browse/JDK-8189091 Thanks, Ujwal. From Alan.Bateman at oracle.com Tue Oct 10 11:57:41 2017 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Tue, 10 Oct 2017 12:57:41 +0100 Subject: RFR : JDK-8044122 MBean access to the PID In-Reply-To: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> References: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> Message-ID: <4d34b7fd-6623-464b-1a7b-ef4089ad1279@oracle.com> On 10/10/2017 12:47, Ujwal Vangapally wrote: > Kindly review the changes made. > > https://bugs.openjdk.java.net/browse/JDK-8044122 > > webrev : > http://cr.openjdk.java.net/~uvangapally/webrev/2017/8044122/webrev.00/ > > CSR : https://bugs.openjdk.java.net/browse/JDK-8189091 The term "PID" is not defined in this context so I think this will need improved javadoc to define the term. In Process API, the term used is "process ID" and we should try to keep the API consistent if we can. -Alan. 
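For readers following the terminology point above, here is a minimal sketch (not taken from the webrev under review) contrasting the Process API's notion of "process ID" with the older workaround of parsing the runtime name. The class and method names are hypothetical and chosen only for illustration; the "pid@hostname" shape of RuntimeMXBean.getName() is implementation-specific, so the parsing helper is written to fail softly.

```java
import java.lang.management.ManagementFactory;

public class ProcessIdSketch {
    // Process ID as reported by the Process API (Java 9+). Note the type is long.
    static long pidFromProcessApi() {
        return ProcessHandle.current().pid();
    }

    // Older workaround: parse a leading number from RuntimeMXBean.getName().
    // The "pid@hostname" format is NOT guaranteed by the specification,
    // so this returns -1 when the name does not have that shape.
    static long pidFromRuntimeName() {
        String name = ManagementFactory.getRuntimeMXBean().getName();
        int at = name.indexOf('@');
        if (at <= 0) {
            return -1L;
        }
        try {
            return Long.parseLong(name.substring(0, at));
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    public static void main(String[] args) {
        System.out.println("Process API pid:  " + pidFromProcessApi());
        System.out.println("Name-derived pid: " + pidFromRuntimeName());
    }
}
```

The first helper is the one whose wording ("process ID") and return type (long) the javadoc discussion refers to; the second is only a fallback on runtimes where the name happens to follow the usual shape.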
From ujwal.vangapally at oracle.com Tue Oct 10 12:00:53 2017 From: ujwal.vangapally at oracle.com (Ujwal Vangapally) Date: Tue, 10 Oct 2017 17:30:53 +0530 Subject: RFR : JDK-8044122 MBean access to the PID In-Reply-To: <4d34b7fd-6623-464b-1a7b-ef4089ad1279@oracle.com> References: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> <4d34b7fd-6623-464b-1a7b-ef4089ad1279@oracle.com> Message-ID: <1a5d7880-aa98-3736-a779-1cf1a4a5b58d@oracle.com> Thanks for the review Alan, will make that change in next webrev. Ujwal. On 10/10/2017 5:27 PM, Alan Bateman wrote: > On 10/10/2017 12:47, Ujwal Vangapally wrote: >> Kindly review the changes made. >> >> https://bugs.openjdk.java.net/browse/JDK-8044122 >> >> webrev : >> http://cr.openjdk.java.net/~uvangapally/webrev/2017/8044122/webrev.00/ >> >> CSR : https://bugs.openjdk.java.net/browse/JDK-8189091 > The term "PID" is not defined in this context so I think this will > need improved javadoc to define the term. In Process API, the term > used is "process ID" and we should try to keep the API consistent if > we can. > > -Alan. From harsha.wardhana.b at oracle.com Tue Oct 10 12:21:07 2017 From: harsha.wardhana.b at oracle.com (Harsha Wardhana B) Date: Tue, 10 Oct 2017 17:51:07 +0530 Subject: RFR : JDK-8044122 MBean access to the PID In-Reply-To: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> References: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> Message-ID: Hi Ujwal, Could you please add a test-case to validate your changes? You can spawn a new process and it can exchange its pid to its parent via System.out/in or via sockets. Also, VMManagementImpl:145, the change from getProcessId to getVmPid seems unnecessary. -Harsha On Tuesday 10 October 2017 05:17 PM, Ujwal Vangapally wrote: > Kindly review the changes made. 
> > https://bugs.openjdk.java.net/browse/JDK-8044122 > > webrev : > http://cr.openjdk.java.net/~uvangapally/webrev/2017/8044122/webrev.00/ > > CSR : https://bugs.openjdk.java.net/browse/JDK-8189091 > > Thanks, > > Ujwal. > From jini.george at oracle.com Tue Oct 10 12:23:25 2017 From: jini.george at oracle.com (Jini George) Date: Tue, 10 Oct 2017 17:53:25 +0530 Subject: RFR: SA: MacOS X: 8184042: several serviceability/sa tests timed out on MacOS X In-Reply-To: References: <2121618c-2878-385e-72bb-d9dc6c25b65a@oracle.com> <7b5ba4a7-2463-d9e3-b0a4-76957c630719@oracle.com> <081d276d-4616-364c-2f9c-acda35b4eaf4@oracle.com> <6b4cb988-a4e4-de22-1728-c87dccfde972@oracle.com> Message-ID: Thank you, David and Serguei. - Jini. On 10/10/2017 9:48 AM, serguei.spitsyn at oracle.com wrote: > Hi Jini, > > +1 > > Thanks, > Serguei > > > On 10/9/17 20:31, David Holmes wrote: >> Thanks for your patience on this one Jini! The change looks good. >> >> Thanks, >> David >> >> On 10/10/2017 2:15 AM, Jini George wrote: >>> Hi all, >>> >>> I have created a webrev restoring the PT_ATTACH: >>> >>> http://cr.openjdk.java.net/~jgeorge/8184042/webrev.01/ >>> >>> Have included Dmitry's comments on disabling the the deprecation >>> warning. I would like to request for reviews for this. >>> >>> Thank you, >>> Jini. >>> >>> >>> On 9/8/2017 3:09 AM, serguei.spitsyn at oracle.com wrote: >>>> On 8/25/17 02:24, serguei.spitsyn at oracle.com wrote: >>>>> Hi Jini, >>>>> >>>>> >>>>> On 8/18/17 04:00, David Holmes wrote: >>>>>> Hi Jini, >>>>>> >>>>>> Just reading the bug report and your description below this seems >>>>>> like a major change to try and use a facility (mach exceptions) >>>>>> that no one seems to have any experience with! That isn't >>>>>> something to be rushed. >>>>> >>>>>> Even if PT_ATTACH has been deprecated restoring its use may be the >>>>>> quick way forward instead of trying to rush in something like this. >>>>> >>>>> This approach looks reasonable to me. 
>>>> >>>> I've just realized that my statement might sound incorrectly. >>>> I meant that the David's suggestion to restore the use of the >>>> deprecated PT_ATTACH looks reasonable. >>>> >>>> Sorry, if it caused any confusion. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>>> Otherwise, it would be nice to hear why it is not good. >>>>> How much would it break the fix of the JDK-8182299? >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> >>>>>> Just my 2c. >>>>>> >>>>>> Cheers, >>>>>> David >>>>>> >>>>>> On 18/08/2017 8:00 PM, Jini George wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Requesting reviews for: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8184042 >>>>>>> >>>>>>> Webrev: http://cr.openjdk.java.net/~jgeorge/8184042/webrev.00/ >>>>>>> >>>>>>> Problem gist: The deprecated ptrace() command, PT_ATTACH was >>>>>>> changed to PT_ATTACHEXC, which causes mach exceptions (and not >>>>>>> UNIX signals) to be delivered via mach messages.This caused SA to >>>>>>> hang at waitpid() waiting for a signal, which does not arrive. >>>>>>> >>>>>>> Solution in a nutshell: The solution is to make the required >>>>>>> changes to handle mach 'soft signal' exceptions in the form of >>>>>>> mach messages instead of signals, while attaching to and >>>>>>> detaching from the target process. The detailed steps are >>>>>>> outlined in JBS. >>>>>>> >>>>>>> The changes appear huge due to the inclusion of pre-generated >>>>>>> mach exception handling files (mach_exc*). Since this is an >>>>>>> integration blocker, it would be great to get quick reviews on this. >>>>>>> >>>>>>> Thank you, >>>>>>> Jini. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>> >>>> > From ujwal.vangapally at oracle.com Tue Oct 10 17:01:24 2017 From: ujwal.vangapally at oracle.com (Ujwal Vangapally) Date: Tue, 10 Oct 2017 22:31:24 +0530 Subject: RFR : JDK-8044122 MBean access to the PID In-Reply-To: References: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> Message-ID: Thanks for the review Harsha. 
Kindly see my comments inline. On 10/10/2017 5:51 PM, Harsha Wardhana B wrote: > Hi Ujwal, > > Could you please add a test-case to validate your changes? You can > spawn a new process and it can exchange its pid to its parent via > System.out/in or via sockets. > Will it be sufficient if I verify that by comparing it with ProcessHandle.current().pid()? > Also, VMManagementImpl:145, the change from getProcessId to getVmPid > seems unnecessary. I will revert it if it is not required. > > -Harsha > > > On Tuesday 10 October 2017 05:17 PM, Ujwal Vangapally wrote: >> Kindly review the changes made. >> >> https://bugs.openjdk.java.net/browse/JDK-8044122 >> >> webrev : >> http://cr.openjdk.java.net/~uvangapally/webrev/2017/8044122/webrev.00/ >> >> CSR : https://bugs.openjdk.java.net/browse/JDK-8189091 >> >> Thanks, >> >> Ujwal. >> > From mandy.chung at oracle.com Tue Oct 10 17:20:12 2017 From: mandy.chung at oracle.com (mandy chung) Date: Tue, 10 Oct 2017 10:20:12 -0700 Subject: RFR : JDK-8044122 MBean access to the PID In-Reply-To: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> References: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> Message-ID: <7b20fd53-e7e9-337a-d77a-e068deeb10b7@oracle.com> On 10/10/17 4:47 AM, Ujwal Vangapally wrote: > Kindly review the changes made. > > https://bugs.openjdk.java.net/browse/JDK-8044122 > > webrev : > http://cr.openjdk.java.net/~uvangapally/webrev/2017/8044122/webrev.00/ > RuntimeMXBean.java @since is missing. Process::pid is long rather than int. The javadoc for this method should be consistent with Process::pid, as Alan points out. VMManagementImpl.java I think getProcessId should probably be reimplemented using ProcessHandle.current().pid(); Please include a unit test for it.
Mandy From Roger.Riggs at Oracle.com Tue Oct 10 17:55:08 2017 From: Roger.Riggs at Oracle.com (Roger Riggs) Date: Tue, 10 Oct 2017 13:55:08 -0400 Subject: RFR : JDK-8044122 MBean access to the PID In-Reply-To: <7b20fd53-e7e9-337a-d77a-e068deeb10b7@oracle.com> References: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> <7b20fd53-e7e9-337a-d77a-e068deeb10b7@oracle.com> Message-ID: Hi Ujwal, In the implementation RuntimeMXBean.java: 72: Include a message "getProcessId" in the throw new Unsupported... In the text and @return change "PID" to "process ID" as Alan suggested. 66: the @implSpec should be on its own line so the text starts on a new line to make the source more readable. Adding a test for getProcessId() should fit into one of the existing tests that spawns and then checks the attributes of a VM. Perhaps MXBeanInteropTest1.java Roger On 10/10/2017 1:20 PM, mandy chung wrote: > > > On 10/10/17 4:47 AM, Ujwal Vangapally wrote: >> Kindly review the changes made. >> >> https://bugs.openjdk.java.net/browse/JDK-8044122 >> >> webrev : >> http://cr.openjdk.java.net/~uvangapally/webrev/2017/8044122/webrev.00/ >> > > RuntimeMXBean.java > @since is missing > > Process::pid is long rather than int. The javadoc for this method > should be consistent with Process::pid, as Alan points out. > > VMManagementImpl.java > I think getProcessId should probably be replaced to implement with > ProcessHandle.current().pid(); > > Please include an unit test for it.
> > Mandy From daniel.fuchs at oracle.com Wed Oct 11 09:00:45 2017 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Wed, 11 Oct 2017 10:00:45 +0100 Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent In-Reply-To: <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com> References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com> Message-ID: Hi Harsha, Your changes look good. However, I still have a nagging doubt: What happens if two Java processes share the same password file, and it needs hashing? Is there any protection in place to prevent the two processes from writing to the same file concurrently? best regards, -- daniel On 09/10/2017 06:34, Harsha Wardhana B wrote: > Hi Daniel, > > Below is the webrev addressing the review comments. > > http://cr.openjdk.java.net/~hb/5016517/webrev.04/ > > On Friday 06 October 2017 03:38 PM, Daniel Fuchs wrote: >> Hi Harsha, >> >> Good work! >> >> > http://cr.openjdk.java.net/~hb/5016517/webrev.03/ >> >> long standing typo in management.properties at line 90: >> >> measureRole => monitorRole > Done. >> >> HashedPasswordManager.java: loadPasswords() >> >> It seems this function will add the header to the file even >> if it already contains the header. >> >> So every time a user/administrator wants to change/add a password, >> the header will be inserted again. >> > Yes. It is fixed in the new webrev.
>> >> HashedPasswordFileTest should probably have a test for this >> scenario as well: >> >> generate password file with clear text password >> load it, then verify passwords have been hashed (properly) >> add some new user/name password to the same file >> load it again, verify all passwords are hashed >> (do this a number of times - to make sure it doesn't >> break the second or third time) >> and finally verify the header is only present once ;-) >> > Done. Added a testcase for the same. >> I'm surprised no other tests had to be modified. >> Is password hash disabled by default in the default agent? >> > Password hashing is enabled by default. But it is only the > implementation that is changed. The pluggable JAAS mechanism isolates > interfaces from implementation. So in theory, all tests should pass. >> If not then you should try (locally) running jtreg >> more than once over the default agent tests. >> Just make sure running the same test twice doesn't >> make the legacy tests that use password files fail the >> second time when they discover that passwords have been >> hashed under their feet (the client part of the test >> might be reading the password file too to see which >> password it should send to the agent). >> >> Otherwise I think it looks good to me - provided all >> tests are passing! >> > Done. Had a few test failures but nothing related to this enhancement. >> best regards, >> >> -- daniel > Thanks > Harsha >> >> On 06/10/2017 06:25, Harsha Wardhana B wrote: >>> Hi All, >>> >>> Previously, for the default agent, hashing of the passwords was done >>> during the agent boot-up (ConnectorBootstrap.java). That was an error >>> since the login configuration could be different and is determined only >>> when a login attempt is made. It would then be pointless to hash the >>> password file. The fix for the above and some off-list comments are >>> incorporated in the webrev below.
>>> >>> http://cr.openjdk.java.net/~hb/5016517/webrev.03/ >>> >>> -Harsha >>> >>> >>> On Wednesday 04 October 2017 01:53 PM, Harsha Wardhana B wrote: >>>> Hi Roger, >>>> >>>> Below is the webrev incorporating changes suggested by you. >>>> >>>> http://cr.openjdk.java.net/~hb/5016517/webrev.02/ >>>> >>>> -Harsha >>>> >>>> On Wednesday 04 October 2017 12:54 AM, Roger Riggs wrote: >>>>> Hi Harsha, >>>>> >>>>> FileLoginModule.java: 104: Add a period at the end of the >>>>> sentence. >>>>> >>>>> JMXPluggableAuthenticator.java: line 306: Is the difference >>>>> between singular and plural significant? >>>>> It would be less confusing if both were plural (hashPasswords). >>>> Ok. >>>>> ConnectorBootstrap: >>>>> 134: ...password.file.hash" and HashedPasswordManager disagree on >>>>> the exact string. >>>>> I would propose 'hashpasswords' as the suffix in all places to be >>>>> consistent >>>>> in ConnectorBootstrap.java, HashedPasswordManager (except for >>>>> capitalization), >>>>> jmxremote.password.template, and management.properties >>>> Do you want to rename HashedPasswordManager class? >>>>> >>>>> As is you have a mix of "...password.hash", >>>>> "...password.file.hash", "...hashpassword"; >>>>> that's not good for knowing there is only one semantic. >>>>> >>>>> line 482: " ," -> ", " space after comma, not before >>>>> >>>> Will incorporate above comments. >>>>> line: 771: is it intentional to discard the reference to the new >>>>> HashedPasswordManager? >>>>> If the intention is only to use the side effect of loadPasswords, >>>>> then please >>>>> create a static method in HashedPasswordManager for that purpose. >>>>> (Even if it just does the same code; it would be clear that's the >>>>> purpose). >>>>> (It probably also implies that the password file will be read a >>>>> second time somewhere else in the initialization).
>>>> Static methods just to hash passwords can be created but >>>> HashedPasswordManager class will have to be re-factored since almost >>>> all methods are using instance variables. Not sure if we want >>>> instance methods and look-alike static methods side-by-side. >>>> Wouldn't that be more confusing than the current implementation? >>>>> >>>>> line:770: the string constant would be nicer as a final static >>>>> string somewhere. >>>>> "jmx.remote.x.password.file.hashpassword" >>>> None of the "jmx.remote.x.*" properties have static strings. They are used 'as >>>> is' all over the code to maintain isolation between pluggable login >>>> authenticator and JDK code. >>>>> >>>>> Roger >>>>> >>>>> >>>> Harsha >>>>> >>>>> On 10/3/2017 3:47 PM, Harsha Wardhana B wrote: >>>>>> >>>>>> Hi Roger, >>>>>> >>>>>> Thanks for the detailed review. Below is the webrev >>>>>> addressing all the review comments. >>>>>> >>>>>> http://cr.openjdk.java.net/~hb/5016517/webrev.01/ >>>>>> >>>>>> -Harsha >>>>>> >>>>>> >>>>>> On Tuesday 25 April 2017 10:56 PM, Roger Riggs wrote: >>>>>>> Hi Harsha, >>>>>>> >>>>>>> Thanks for this important improvement. Comments: >>>>>>> >>>>>>> >>>>>>> * jmxremote.password.template: >>>>>>> "Passwords will be hashed by server if they are in clear." >>>>>>> Perhaps should be more explicit: >>>>>>> >>>>>>> "The jmxremote.passwords file will be re-written by the server >>>>>>> to replace all plain text passwords with hashed passwords when >>>>>>> the file is read by the server." >>>>>>> >>>>>>> line 35: "Base64 encoded hash" -> drop the "Base64"; in this line >>>>>>> it isn't needed and >>>>>>> makes it seem like it should appear as 1 field instead of 2 or 3. >>>>>>> >>>>>>> 37+: The syntax of the file may be clearer if it includes the >>>>>>> complete syntax in (line 39) not >>>>>>> just the password/hash fragment. >>>>>>> >>>>>>> Line 41:
"W = spaces"; above "tabs" are allowed as a delimiter; >>>>>>> it would be good to be consistent >>>>>>> and include the usualy white-space characters in the set, be as >>>>>>> specific as possible. >>>>>>> Is this the same set of whitespace used by Regex '\\s'. >>>>>> Only spaces and tabs are allowed. '\s' matches newline as well >>>>>> hence not allowed. >>>>>>> >>>>>>> 45: "java platform""? ->?? "MD5, SDA-1, SHA-256 are supported >>>>>>> algorithms." >>>>>>> >>>>>>> 49: be more specific about 'hashing is requested'? how? Refer to >>>>>>> the management.properties >>>>>>> ? com.sun.management.jmxremote.password.hash value. >>>>>>> >>>>>>> >>>>>>> >>>>>>> 51:? "replace hashed" -> "replace *the *hashed" >>>>>>> 52: "with clear text or new" -> "with the clear text or the new" >>>>>>> 52: "If new password" -> "If the new password" >>>>>>> 53: "when new login" -> "when a new login" >>>>>>> >>>>>>> 60: "User generated" -> "A User generated" >>>>>>> >>>>>>> 67: Will the file be ignored if it has the wrong permissions. >>>>>>> (With a logged message) >>>>>> Addressed all the above review comments. >>>>>>> >>>>>>> * management.properties >>>>>>> >>>>>>> 306: "(Case for true/false ignored)"? - what does this mean; I >>>>>>> think it can be removed. >>>>>>> >>>>>>> 307: missing period at the end of the sentence. >>>>>>> 309: "in password file" -> "in the password file" >>>>>>> >>>>>> Done. >>>>>>> >>>>>>> * FileLoginModule.java >>>>>>> >>>>>>> 102: can this match better the similar name in the >>>>>>> management.properties if it has the same function: >>>>>>> ??? com.sun.management.jmxremote.password.hash >>>>>> Are you suggesting that 'hashPassword' be renamed to something >>>>>> similar to com.sun.management.jmxremote.password.hash? Variable >>>>>> names cannot be similar to property names since property names are >>>>>> long and provide complete context which local variables need not >>>>>> have to do. 
>>>>> the suffix should be the same in all places since it is a single >>>>> semantic. >>>> Done. >>>>>>> 103: "replaces clear text passwords" -> "replaces each clear text >>>>>>> password" >>>>>>> 104: indent to match previous
enteries. >>>>>>> >>>>>>> * JMXPluggableAuthenticator.java >>>>>>> >>>>>>> 119: There is no need to copy the password to a new local >>>>>> It is required since variables accessed from inner class must be >>>>>> final or effectively final. >>>>> right >>>>>>> >>>>>>> 128: add a space after "," >>>>>>> >>>>>>> 256 private static final String HASH_PASSWORDS = >>>>>>> 257 "jmx.remote.x.password.file.hash"; >>>>>>> >>>>>>> The name ".hash" part does not clearly communicate that passwords >>>>>>> are to be hashed. >>>>>>> "hashPasswords" might be more self explanatory. >>>>>> Changed it to "jmx.remote.x.password.file.hashpassword". >>>>> drop the "file." >>>> Done. >>>>>>> Also, can this be NOT duplicated here and in >>>>>>> ConnectorBootStrap.java? >>>>>> The property names used in ConnectorBootStrap follows the >>>>>> convention used in management.properties file - >>>>>> 'com.sun.management.*'. For environment variables for a >>>>>> JMXConnector "jmx.remote.x.*" convention is used . Hence they >>>>>> cannot be duplicated. >>>>> The differing prefix'es are fine as is; no change except to make >>>>> the new keys consistent. >>>>> >>>>>>> >>>>>>> >>>>>>> * ConnectorBootStrap.java: >>>>>>> ?482: Add space after ","s; no spaces before. >>>>>>> >>>>>>> 770: use the same name for the option/property if possible to >>>>>>> avoid confusion. >>>>>> Not possible as explained above. >>>>>>> >>>>>>> 770:? if the HASH_PASSWORDS static is appropriate use it instead >>>>>>> of literal "true". >>>>>> DefaultValues.HASH_PASSWORDS static is set to 'true' and can be >>>>>> used. However using literal "true" is more readable than using the >>>>>> static. >>>>>>> >>>>>>> * HashedPasswordManager >>>>>>> >>>>>>> 80-83: The fields can be final and use the constructor to >>>>>>> initialize in all cases and make the class final >>>>>>> to avoid unintentional subclassing. >>>>>>> >>>>>>> >>>>>>> 113: canWriteToFile:?? 
It should be made clear in the template >>>>>>> that *both* the Security policy >>>>>>> ?? and the file access value are used to check that the file can >>>>>>> be updated. >>>>>> Made it explicit in template as well as code comments. >>>>>>> >>>>>>> 200: loadPasswords() - should this confirm the access to the file >>>>>>> is allowed and it has >>>>>>> the correct file access before reading? >>>>>> Not really required. Appropriate exceptions are thrown if file >>>>>> cannot be accessed. >>>>>>> >>>>>>> Is the re-writing of the passwords intended to be done by a >>>>>>> 'priveleged' system. >>>>>>> Does this need doPrivileged? >>>>>> I am not sure. Maybe it will be covered in the security review. >>>>>>> >>>>>>> * HashedPasswordFileTest: >>>>>>> >>>>>>> 88: should use the TestLibrary Utils.getRandomInstance so it logs >>>>>>> the seed and can be replayed if necessary. >>>>>>> >>>>>>> >>>>>> Done >>>>>>> Thanks, Roger >>>>>>> >>>>>> Thanks >>>>>> Harsha >>>>>>> >>>>>>> On 4/23/2017 6:20 AM, Harsha Wardhana B wrote: >>>>>>>> >>>>>>>> Hi All, >>>>>>>> >>>>>>>> Please review this enhancement to replace plain-text password >>>>>>>> for JMX agent with SHA-256 hash. >>>>>>>> >>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-5016517 >>>>>>>> >>>>>>>> >>>>>>>> webrev: http://cr.openjdk.java.net/~hb/5016517/webrev.00/ >>>>>>>> >>>>>>>> Overview of implementation: >>>>>>>> >>>>>>>> Currently, the JMX agent password file used to authenticate >>>>>>>> user, stores user name and password as clear text. Though system >>>>>>>> level restrictions are recommended for jmx password file, >>>>>>>> passwords are vulnerable since they are stored in clear. The >>>>>>>> current RFE proposes to store passwords as SHA256 hash instead >>>>>>>> of clear text. >>>>>>>> >>>>>>>> In current implementation, if password file is writable, and if >>>>>>>> passwords are in clear, they will be replaced by SHA256 hash >>>>>>>> upon agent boot-up or when login attempt is made. 
>>>>>>>>
>>>>>>>> The file,
>>>>>>>> src/jdk.management.agent/share/conf/jmxremote.password.template
>>>>>>>> contains more details about the implementation.
>>>>>>>>
>>>>>>>> - Harsha
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

From ujwal.vangapally at oracle.com  Wed Oct 11 10:20:16 2017
From: ujwal.vangapally at oracle.com (Ujwal Vangapally)
Date: Wed, 11 Oct 2017 15:50:16 +0530
Subject: RFR : JDK-8044122 MBean access to the PID
In-Reply-To:
References: <2721cad6-a8a4-af6e-f0dd-90e933580003@oracle.com> <7b20fd53-e7e9-337a-d77a-e068deeb10b7@oracle.com>
Message-ID: <6e09f0f8-6040-bd84-47f0-1e88239e9164@oracle.com>

Thanks for the review and suggestions, Mandy and Roger.

Kindly see my comments inline.

On 10/10/2017 11:25 PM, Roger Riggs wrote:
> Hi Ujwal,
>
> In the implementation RuntimeMXBean.java: 72: Include a message
> "getProcessId" in the throw new Unsupported...
> In the text and @return change "PID" to "process ID" as Alan suggested.
> 66: the @implSpec should be on its own line so the text starts on a
> new line to make the source more readable.
>
> Adding a test for getProcessId() should fit into one of the existing
> tests that spawns and then checks
> the attributes of a vm. Perhaps MXBeanInteropTest1.java
I will make changes as suggested.
>
> Roger
>
>
>
> On 10/10/2017 1:20 PM, mandy chung wrote:
>>
>>
>> On 10/10/17 4:47 AM, Ujwal Vangapally wrote:
>>> Kindly review the changes made.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8044122
>>>
>>> webrev :
>>> http://cr.openjdk.java.net/~uvangapally/webrev/2017/8044122/webrev.00/
>>>
>>
>> RuntimeMXBean.java
>> @since is missing
>>
I will add it.
>> Process::pid is long rather than int. The javadoc for this method
>> should be consistent with Process::pid, as Alan points out.
Will do it.
>>
>> VMManagementImpl.java
>> I think getProcessId should probably be replaced to implement
>> with ProcessHandle.current().pid();
>>
You mean it would be better to use ProcessHandle.current().pid() in
RuntimeImpl.java instead of jvm.getVmPid()? Kindly clarify.
>> Please include a unit test for it.
>>
Will it be sufficient to add it to the existing test
MXBeanInteropTest1.java?

         System.out.println("getName\t\t" + runtime.getName());
+        System.out.println("getPid\t\t"
+                           + runtime.getPid());
         System.out.println("getSpecName\t\t" + runtime.getSpecName());

>> Mandy
> Ujwal

From jini.george at oracle.com  Wed Oct 11 12:22:10 2017
From: jini.george at oracle.com (Jini George)
Date: Wed, 11 Oct 2017 17:52:10 +0530
Subject: RFR: 8189069: regression after push of 8187403: "AssertionFailure: addr should be OopHandle"
In-Reply-To: <49e1a6b6-bc89-d158-cbe1-3e163edf4b09@oracle.com>
References: <49e1a6b6-bc89-d158-cbe1-3e163edf4b09@oracle.com>
Message-ID: <075beb8e-4ffe-0292-6b46-053e1d2dc68c@oracle.com>

Hi Yasumasa,

The changes look fine.

Thanks,
Jini (not a Reviewer).

On 10/10/2017 7:40 AM, serguei.spitsyn at oracle.com wrote:
> Hi Yasumasa,
>
> Thank you for the quick fix!
> It looks good.
> I'll sponsor your fix after we get at least one more review.
>
> Thanks,
> Serguei
>
>
> On 10/9/17 19:03, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> The following serviceability/sa tests fail after 8187403:
>>
>>    serviceability/sa/TestHeapDumpForInvokeDynamic.java
>>    serviceability/sa/TestHeapDumpForLargeArray.java
>>    serviceability/sa/jmap-hprof/JMapHProfLargeHeapTest.java
>>
>> These failures are caused by the address of HeapRegion.
>> The address which is passed to the c'tor of HeapRegion might not be
>> an OopHandle.
>> So we have to switch the method of address calculation.
>>
>> I uploaded a webrev for this issue. Could you review it?
>>
>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8189069/webrev.00/
>>
>>
>> I cannot access JPRT. So I need a sponsor.
>>
>>
>> Thanks,
>>
>> Yasumasa
>

From Roger.Riggs at Oracle.com  Wed Oct 11 15:51:25 2017
From: Roger.Riggs at Oracle.com (Roger Riggs)
Date: Wed, 11 Oct 2017 11:51:25 -0400
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To:
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com>
Message-ID:

Hi Harsha,

conf/management.properties:
 - typo line 307: pa*sss*words

HashedPasswordManager.java:
 - line 46: "classes" -> "class"
 - lines 84-87: "private" and "static" come before "final" in declarations.
 - line 158 and everywhere: add a space after "if", before "(".
 - line 202: add "the" before "password file".
 - line 287: why a separate canWriteToFile()? It does the same check as
   newFileWriter(passwordFile); instead, catch the exception, log it,
   and then ignore it.

Looking good.

Regards, Roger

On 10/11/2017 5:00 AM, Daniel Fuchs wrote:
> Hi Harsha,
>
> Your changes look good. However, I still have a nagging doubt:
>
> What happens if two Java processes share the same password file,
> and it needs hashing? Is there any protection in place
> to prevent the two processes from writing to the same
> file concurrently?
>
> best regards,
>
> -- daniel
>
> On 09/10/2017 06:34, Harsha Wardhana B wrote:
>> Hi Daniel,
>>
>> Below is the webrev addressing the review comments.
>>
>> http://cr.openjdk.java.net/~hb/5016517/webrev.04/
>>
>> On Friday 06 October 2017 03:38 PM, Daniel Fuchs wrote:
>>> Hi Harsha,
>>>
>>> Good work!
>>>
>>> > http://cr.openjdk.java.net/~hb/5016517/webrev.03/
>>>
>>> long standing typo in management.properties at line 90:
>>>
>>>   measureRole => monitorRole
>> Done.
>>> >>> HashedPasswordManager.java: loadPasswords() >>> >>> It seems this function will add the header to the file even >>> if it already contains the header. >>> >>> So every time a user/administrator wants to change/add a password, >>> the header will be inserted again. >>> >> Yes. It is fixed in the new webrev. >>> >>> HashedPasswordFileTest should probably have a test for this >>> scenario as well: >>> >>> generate password file with clear text password >>> load it, then verify passwords have been hashed (properly) >>> add some new user/name password to the same file >>> load it again, verify all passwords are hashed >>> (do this a number of times - to make sure it doesn't >>> ?break the second or third time) >>> and finally verify the header is only present once ;-) >>> >> Done. Added a testcase for the same. >>> I'm surprised no other tests had to be modified. >>> Is password hash disabled by default in the default agent? >>> >> Password hashing is enabled by default. But it is only the >> implementation that is changed. The pluggable JAAS mechanism isolates >> interfaces from implementation. So in theory, all tests should pass. >>> If not then you should try (locally) running jtreg >>> more than once over the default agent tests. >>> Just make sure running the same test twice doesn't >>> make the legacy tests that use password files failing the >>> second time when they discover that passwords have been >>> hashed under their feet (the client part of the test >>> might be reading the password file too to see which >>> password it should send to the agent). >>> >>> Otherwise I think it looks good to me - provided all >>> tests are passing! >>> >> Done. Had a few test failures but nothing related to this enhancement. 
>>> best regards, >>> >>> -- daniel >> Thanks >> Harsha >>> >>> On 06/10/2017 06:25, Harsha Wardhana B wrote: >>>> Hi All, >>>> >>>> Previously, for default agent, hashing of the passwords was done >>>> during the agent boot-up (ConnectorBootstrap.java). That was an >>>> error since login configuration could be different and is >>>> determined only when a login attempt is made. It would be then >>>> pointless to hash the password file. The fix for above and some >>>> off-list comments are incorporated in webrev below. >>>> >>>> http://cr.openjdk.java.net/~hb/5016517/webrev.03/ >>>> >>>> -Harsha >>>> >>>> >>>> On Wednesday 04 October 2017 01:53 PM, Harsha Wardhana B wrote: >>>>> Hi Roger, >>>>> >>>>> Below is the webrev incorporating changes suggested by you. >>>>> >>>>> http://cr.openjdk.java.net/~hb/5016517/webrev.02/ >>>>> >>>>> -Harsha >>>>> >>>>> On Wednesday 04 October 2017 12:54 AM, Roger Riggs wrote: >>>>>> Hi Harsha, >>>>>> >>>>>> FileLoginModule.java:? 104:? Add a period at the end of the the >>>>>> sentence. >>>>>> >>>>>> JMXPluggableAuthenticator.java: line 306:? Is the difference >>>>>> between singular and plural significant? >>>>>> ? It would be less confusing if both were plural (hashPasswords). >>>>> Ok. >>>>>> ConnectorBootstrap: >>>>>> 134: ...password.file.hash" and HashedPasswordManager disagree on >>>>>> the exact string. >>>>>> I would propose 'hashpasswords' as the suffix in all places to be >>>>>> consistent >>>>>> in ConnectorBootstrap.java, HashedPasswordManager (except for >>>>>> capitalization), >>>>>> jmxremote.password.template, and management.properties >>>>> Do you want to rename HashedPasswordManager class? >>>>>> >>>>>> As is you have a mix of "...password.hash", >>>>>> "...password.file.hash", "...hashpassword"; >>>>>> that's not good for knowing there is only one semantic. >>>>>> >>>>>> line 482:? " ," -> ", "? space after comma, not before >>>>>> >>>>> Will incorporate above comments. 
>>>>>> line: 771: is it intentional to discard the reference to the new >>>>>> HashedPasswordManager? >>>>>> If the intention is only to use the side effect of loadPasswords, >>>>>> then please >>>>>> create a static method in HashedPasswordManager for that purpose. >>>>>> (Even if just does the same code; it would be clear that's the >>>>>> purpose). >>>>>> (It probably also implies that the password file will be read a >>>>>> second time somewhere else in the initialization). >>>>> Static methods just to hash passwords can be created but >>>>> HashedPasswordManager class will have to be re-factored since >>>>> almost all methods are using instance variables. Not sure if we >>>>> want instance methods and look-alike static methods side-by-side. >>>>> Wouldn't that be more confusing than current implementation? >>>>>> >>>>>> line:770:? the string constant would be nicer as a final static >>>>>> string somewhere. >>>>>> ? "jmx.remote.x.password.file.hashpassword" >>>>> All of "jmx.remote.x.*" don't have static strings. They are used >>>>> 'as is' all over the code to maintain isolation between pluggable >>>>> login authenticator and JDK code. >>>>>> >>>>>> Roger >>>>>> >>>>>> >>>>> Harsha >>>>>> >>>>>> On 10/3/2017 3:47 PM, Harsha Wardhana B wrote: >>>>>>> >>>>>>> Hi Roger,>>> Thanks for the detailed review. Below is the webrev >>>>>>> addressing all the review comments. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~hb/5016517/webrev.01/ >>>>>>> >>>>>>> -Harsha >>>>>>> >>>>>>> >>>>>>> On Tuesday 25 April 2017 10:56 PM, Roger Riggs wrote: >>>>>>>> Hi Harsha, >>>>>>>> >>>>>>>> Thanks for this important improvement. Comments: >>>>>>>> >>>>>>>> >>>>>>>> * jmxremote.password.template: >>>>>>>> ? "Passwords will be hashed by server if they are in clear." >>>>>>>> Perhaps should be more explicit: >>>>>>>> >>>>>>>> ?? 
"The jmxremote.passwords file will be re-written by the >>>>>>>> server to replace all plain text passwords with hashed >>>>>>>> passwords when the file is read by the server." >>>>>>>> >>>>>>>> line 35: "Base64 encoded hash"? -> drop the "Base64" in this >>>>>>>> line isn't needed and >>>>>>>> make it seems like it should appear as 1 field instead of 2 or 3. >>>>>>>> >>>>>>>> 37+: The syntax of the file may be clearer if it includes the >>>>>>>> complete syntax in (line 39) not >>>>>>>> just the password/hash fragment. >>>>>>>> >>>>>>>> Line 41:? "W = spaces"; above "tabs" are allowed as a >>>>>>>> delimiter; it would be good to be consistent >>>>>>>> and include the usualy white-space characters in the set, be as >>>>>>>> specific as possible. >>>>>>>> Is this the same set of whitespace used by Regex '\\s'. >>>>>>> Only spaces and tabs are allowed. '\s' matches newline as well >>>>>>> hence not allowed. >>>>>>>> >>>>>>>> 45: "java platform""? ->?? "MD5, SDA-1, SHA-256 are supported >>>>>>>> algorithms." >>>>>>>> >>>>>>>> 49: be more specific about 'hashing is requested' how? Refer to >>>>>>>> the management.properties >>>>>>>> ? com.sun.management.jmxremote.password.hash value. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 51:? "replace hashed" -> "replace *the *hashed" >>>>>>>> 52: "with clear text or new" -> "with the clear text or the new" >>>>>>>> 52: "If new password" -> "If the new password" >>>>>>>> 53: "when new login" -> "when a new login" >>>>>>>> >>>>>>>> 60: "User generated" -> "A User generated" >>>>>>>> >>>>>>>> 67: Will the file be ignored if it has the wrong permissions. >>>>>>>> (With a logged message) >>>>>>> Addressed all the above review comments. >>>>>>>> >>>>>>>> * management.properties >>>>>>>> >>>>>>>> 306: "(Case for true/false ignored)"? - what does this mean; I >>>>>>>> think it can be removed. >>>>>>>> >>>>>>>> 307: missing period at the end of the sentence. >>>>>>>> 309: "in password file" -> "in the password file" >>>>>>>> >>>>>>> Done. 
>>>>>>>> >>>>>>>> * FileLoginModule.java >>>>>>>> >>>>>>>> 102: can this match better the similar name in the >>>>>>>> management.properties if it has the same function: >>>>>>>> ??? com.sun.management.jmxremote.password.hash >>>>>>> Are you suggesting that 'hashPassword' be renamed to something >>>>>>> similar to com.sun.management.jmxremote.password.hash? Variable >>>>>>> names cannot be similar to property names since property names >>>>>>> are long and provide complete context which local variables need >>>>>>> not have to do. >>>>>> the suffix should be the same in all places since it is a single >>>>>> semantic. >>>>> Done. >>>>>>>> 103: "replaces clear text passwords" -> "replaces each clear >>>>>>>> text password" >>>>>>>> 104: indent to match previous
enteries. >>>>>>>> >>>>>>>> * JMXPluggableAuthenticator.java >>>>>>>> >>>>>>>> 119: There is no need to copy the password to a new local >>>>>>> It is required since variables accessed from inner class must be >>>>>>> final or effectively final. >>>>>> right >>>>>>>> >>>>>>>> 128: add a space after "," >>>>>>>> >>>>>>>> 256 private static final String HASH_PASSWORDS = >>>>>>>> 257 "jmx.remote.x.password.file.hash"; >>>>>>>> >>>>>>>> The name ".hash" part does not clearly communicate that >>>>>>>> passwords are to be hashed. >>>>>>>> "hashPasswords" might be more self explanatory. >>>>>>> Changed it to "jmx.remote.x.password.file.hashpassword". >>>>>> drop the "file." >>>>> Done. >>>>>>>> Also, can this be NOT duplicated here and in >>>>>>>> ConnectorBootStrap.java? >>>>>>> The property names used in ConnectorBootStrap follows the >>>>>>> convention used in management.properties file - >>>>>>> 'com.sun.management.*'. For environment variables for a >>>>>>> JMXConnector "jmx.remote.x.*" convention is used . Hence they >>>>>>> cannot be duplicated. >>>>>> The differing prefix'es are fine as is; no change except to make >>>>>> the new keys consistent. >>>>>> >>>>>>>> >>>>>>>> >>>>>>>> * ConnectorBootStrap.java: >>>>>>>> ?482: Add space after ","s; no spaces before. >>>>>>>> >>>>>>>> 770: use the same name for the option/property if possible to >>>>>>>> avoid confusion. >>>>>>> Not possible as explained above. >>>>>>>> >>>>>>>> 770:? if the HASH_PASSWORDS static is appropriate use it >>>>>>>> instead of literal "true". >>>>>>> DefaultValues.HASH_PASSWORDS static is set to 'true' and can be >>>>>>> used. However using literal "true" is more readable than using >>>>>>> the static. >>>>>>>> >>>>>>>> * HashedPasswordManager >>>>>>>> >>>>>>>> 80-83: The fields can be final and use the constructor to >>>>>>>> initialize in all cases and make the class final >>>>>>>> to avoid unintentional subclassing. >>>>>>>> >>>>>>>> >>>>>>>> 113: canWriteToFile:?? 
It should be made clear in the template >>>>>>>> that *both* the Security policy >>>>>>>> ?? and the file access value are used to check that the file >>>>>>>> can be updated. >>>>>>> Made it explicit in template as well as code comments. >>>>>>>> >>>>>>>> 200: loadPasswords() - should this confirm the access to the >>>>>>>> file is allowed and it has >>>>>>>> the correct file access before reading? >>>>>>> Not really required. Appropriate exceptions are thrown if file >>>>>>> cannot be accessed. >>>>>>>> >>>>>>>> Is the re-writing of the passwords intended to be done by a >>>>>>>> 'priveleged' system. >>>>>>>> Does this need doPrivileged? >>>>>>> I am not sure. Maybe it will be covered in the security review. >>>>>>>> >>>>>>>> * HashedPasswordFileTest: >>>>>>>> >>>>>>>> 88: should use the TestLibrary Utils.getRandomInstance so it >>>>>>>> logs the seed and can be replayed if necessary. >>>>>>>> >>>>>>>> >>>>>>> Done >>>>>>>> Thanks, Roger >>>>>>>> >>>>>>> Thanks >>>>>>> Harsha >>>>>>>> >>>>>>>> On 4/23/2017 6:20 AM, Harsha Wardhana B wrote: >>>>>>>>> >>>>>>>>> Hi All, >>>>>>>>> >>>>>>>>> Please review this enhancement to replace plain-text password >>>>>>>>> for JMX agent with SHA-256 hash. >>>>>>>>> >>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-5016517 >>>>>>>>> >>>>>>>>> >>>>>>>>> webrev: http://cr.openjdk.java.net/~hb/5016517/webrev.00/ >>>>>>>>> >>>>>>>>> Overview of implementation: >>>>>>>>> >>>>>>>>> Currently, the JMX agent password file used to authenticate >>>>>>>>> user, stores user name and password as clear text. Though >>>>>>>>> system level restrictions are recommended for jmx password >>>>>>>>> file, passwords are vulnerable since they are stored in clear. >>>>>>>>> The current RFE proposes to store passwords as SHA256 hash >>>>>>>>> instead of clear text. 
>>>>>>>>>
>>>>>>>>> In current implementation, if password file is writable, and
>>>>>>>>> if passwords are in clear, they will be replaced by SHA256
>>>>>>>>> hash upon agent boot-up or when login attempt is made.
>>>>>>>>>
>>>>>>>>> The file,
>>>>>>>>> src/jdk.management.agent/share/conf/jmxremote.password.template
>>>>>>>>> contains more details about the implementation.
>>>>>>>>>
>>>>>>>>> - Harsha
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

From mandy.chung at oracle.com  Wed Oct 11 18:18:44 2017
From: mandy.chung at oracle.com (mandy chung)
Date: Wed, 11 Oct 2017 11:18:44 -0700
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To: <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com>
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com>
Message-ID: <02f07607-b713-a6ef-4e2c-24d642eb595a@oracle.com>

On 10/8/17 10:34 PM, Harsha Wardhana B wrote:
>
> Hi Daniel,
>
> Below is the webrev addressing the review comments.
>
> http://cr.openjdk.java.net/~hb/5016517/webrev.04/
>

This approach seems reasonable. I only reviewed the management.properties
and jmxremote.password.template files.

 304 # ################# Hash passwords in password file ##############
 305 # com.sun.management.jmxremote.password.hashpasswords = true|false
 306 # Default for this property is true.
 307 # Specifies if passswords in the above file should be hashed or not.

typo: passswords

s/above file/password file/ - it has been referred to as "password file"
in many places.

Is there any better alternative to the new property name?
    com.sun.management.jmxremote.password.hashes
    com.sun.management.jmxremote.password.asHashes
    com.sun.management.jmxremote.password.toHashes

  49 # https://docs.oracle.com/javase/7/docs/technotes/guides/security/StandardNames.html#MessageDigest
  50 # MD5, SHA-1 and SHA-256 are supported algorithms.
  51 # This is an optional field. If not specified SHA-256 will be assumed.

I would avoid the link to the documentation of a specific JDK release.
Maybe say: Refer to the "Java Security Standard Algorithm Names
Specification" for supported algorithms.

  53 # If passwords are in clear, they will be over-written by their hash if all of

s/over-written/overwritten

  67 # If multiple entries are found for the same role name, then the last one
  68 # is used.

If there are multiple entries for the same role, will all entries be
overridden with the hash value? It may be better to report an error
when there is more than one entry for the same role.

HashedPasswordFileTest.java
  @bug is missing

Mandy

From harsha.wardhana.b at oracle.com  Thu Oct 12 06:29:15 2017
From: harsha.wardhana.b at oracle.com (Harsha Wardhana B)
Date: Thu, 12 Oct 2017 11:59:15 +0530
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To:
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com>
Message-ID: <73f87611-5a3c-4ed4-d7d6-364d2597964d@oracle.com>

Hi Daniel,

The contents written into the password file are identical and written at
the same offset. Hence the order of the writes should not matter.
However, there is a possibility that the file could be read in the midst
of a password change, so that different processes read different file
contents. Since multiple writes could be done on the file, it is possible
for differing contents to be interleaved into the file, thereby corrupting
the entire file.

I will add FileLock from the nio package to avoid the above race condition.

Thanks
Harsha

On Wednesday 11 October 2017 02:30 PM, Daniel Fuchs wrote:
> Hi Harsha,
>
> Your changes look good. However, I still have a nagging doubt:
>
> What happens if two Java processes share the same password file,
> and it needs hashing? Is there any protection in place
> to prevent the two processes from writing to the same
> file concurrently?
>
> best regards,
>
> -- daniel
>
> On 09/10/2017 06:34, Harsha Wardhana B wrote:
>> Hi Daniel,
>>
>> Below is the webrev addressing the review comments.
>>
>> http://cr.openjdk.java.net/~hb/5016517/webrev.04/
>>
>> On Friday 06 October 2017 03:38 PM, Daniel Fuchs wrote:
>>> Hi Harsha,
>>>
>>> Good work!
>>>
>>> > http://cr.openjdk.java.net/~hb/5016517/webrev.03/
>>>
>>> long standing typo in management.properties at line 90:
>>>
>>>   measureRole => monitorRole
>> Done.
>>>
>>> HashedPasswordManager.java: loadPasswords()
>>>
>>> It seems this function will add the header to the file even
>>> if it already contains the header.
>>>
>>> So every time a user/administrator wants to change/add a password,
>>> the header will be inserted again.
>>>
>> Yes. It is fixed in the new webrev.
>>> >>> HashedPasswordFileTest should probably have a test for this >>> scenario as well: >>> >>> generate password file with clear text password >>> load it, then verify passwords have been hashed (properly) >>> add some new user/name password to the same file >>> load it again, verify all passwords are hashed >>> (do this a number of times - to make sure it doesn't >>> ?break the second or third time) >>> and finally verify the header is only present once ;-) >>> >> Done. Added a testcase for the same. >>> I'm surprised no other tests had to be modified. >>> Is password hash disabled by default in the default agent? >>> >> Password hashing is enabled by default. But it is only the >> implementation that is changed. The pluggable JAAS mechanism isolates >> interfaces from implementation. So in theory, all tests should pass. >>> If not then you should try (locally) running jtreg >>> more than once over the default agent tests. >>> Just make sure running the same test twice doesn't >>> make the legacy tests that use password files failing the >>> second time when they discover that passwords have been >>> hashed under their feet (the client part of the test >>> might be reading the password file too to see which >>> password it should send to the agent). >>> >>> Otherwise I think it looks good to me - provided all >>> tests are passing! >>> >> Done. Had a few test failures but nothing related to this enhancement. >>> best regards, >>> >>> -- daniel >> Thanks >> Harsha >>> >>> On 06/10/2017 06:25, Harsha Wardhana B wrote: >>>> Hi All, >>>> >>>> Previously, for default agent, hashing of the passwords was done >>>> during the agent boot-up (ConnectorBootstrap.java). That was an >>>> error since login configuration could be different and is >>>> determined only when a login attempt is made. It would be then >>>> pointless to hash the password file. The fix for above and some >>>> off-list comments are incorporated in webrev below. 
>>>> >>>> http://cr.openjdk.java.net/~hb/5016517/webrev.03/ >>>> >>>> -Harsha >>>> >>>> >>>> On Wednesday 04 October 2017 01:53 PM, Harsha Wardhana B wrote: >>>>> Hi Roger, >>>>> >>>>> Below is the webrev incorporating changes suggested by you. >>>>> >>>>> http://cr.openjdk.java.net/~hb/5016517/webrev.02/ >>>>> >>>>> -Harsha >>>>> >>>>> On Wednesday 04 October 2017 12:54 AM, Roger Riggs wrote: >>>>>> Hi Harsha, >>>>>> >>>>>> FileLoginModule.java:? 104:? Add a period at the end of the the >>>>>> sentence. >>>>>> >>>>>> JMXPluggableAuthenticator.java: line 306:? Is the difference >>>>>> between singular and plural significant? >>>>>> ? It would be less confusing if both were plural (hashPasswords). >>>>> Ok. >>>>>> ConnectorBootstrap: >>>>>> 134: ...password.file.hash" and HashedPasswordManager disagree on >>>>>> the exact string. >>>>>> I would propose 'hashpasswords' as the suffix in all places to be >>>>>> consistent >>>>>> in ConnectorBootstrap.java, HashedPasswordManager (except for >>>>>> capitalization), >>>>>> jmxremote.password.template, and management.properties >>>>> Do you want to rename HashedPasswordManager class? >>>>>> >>>>>> As is you have a mix of "...password.hash", >>>>>> "...password.file.hash", "...hashpassword"; >>>>>> that's not good for knowing there is only one semantic. >>>>>> >>>>>> line 482:? " ," -> ", "? space after comma, not before >>>>>> >>>>> Will incorporate above comments. >>>>>> line: 771: is it intentional to discard the reference to the new >>>>>> HashedPasswordManager? >>>>>> If the intention is only to use the side effect of loadPasswords, >>>>>> then please >>>>>> create a static method in HashedPasswordManager for that purpose. >>>>>> (Even if just does the same code; it would be clear that's the >>>>>> purpose). >>>>>> (It probably also implies that the password file will be read a >>>>>> second time somewhere else in the initialization). 
>>>>> Static methods just to hash passwords can be created but >>>>> HashedPasswordManager class will have to be re-factored since >>>>> almost all methods are using instance variables. Not sure if we >>>>> want instance methods and look-alike static methods side-by-side. >>>>> Wouldn't that be more confusing than current implementation? >>>>>> >>>>>> line:770:? the string constant would be nicer as a final static >>>>>> string somewhere. >>>>>> ? "jmx.remote.x.password.file.hashpassword" >>>>> All of "jmx.remote.x.*" don't have static strings. They are used >>>>> 'as is' all over the code to maintain isolation between pluggable >>>>> login authenticator and JDK code. >>>>>> >>>>>> Roger >>>>>> >>>>>> >>>>> Harsha >>>>>> >>>>>> On 10/3/2017 3:47 PM, Harsha Wardhana B wrote: >>>>>>> >>>>>>> Hi Roger,>>> Thanks for the detailed review. Below is the webrev >>>>>>> addressing all the review comments. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~hb/5016517/webrev.01/ >>>>>>> >>>>>>> -Harsha >>>>>>> >>>>>>> >>>>>>> On Tuesday 25 April 2017 10:56 PM, Roger Riggs wrote: >>>>>>>> Hi Harsha, >>>>>>>> >>>>>>>> Thanks for this important improvement. Comments: >>>>>>>> >>>>>>>> >>>>>>>> * jmxremote.password.template: >>>>>>>> ? "Passwords will be hashed by server if they are in clear." >>>>>>>> Perhaps should be more explicit: >>>>>>>> >>>>>>>> ?? "The jmxremote.passwords file will be re-written by the >>>>>>>> server to replace all plain text passwords with hashed >>>>>>>> passwords when the file is read by the server." >>>>>>>> >>>>>>>> line 35: "Base64 encoded hash"? -> drop the "Base64" in this >>>>>>>> line isn't needed and >>>>>>>> make it seems like it should appear as 1 field instead of 2 or 3. >>>>>>>> >>>>>>>> 37+: The syntax of the file may be clearer if it includes the >>>>>>>> complete syntax in (line 39) not >>>>>>>> just the password/hash fragment. >>>>>>>> >>>>>>>> Line 41:? 
"W = spaces"; above "tabs" are allowed as a >>>>>>>> delimiter; it would be good to be consistent >>>>>>>> and include the usualy white-space characters in the set, be as >>>>>>>> specific as possible. >>>>>>>> Is this the same set of whitespace used by Regex '\\s'. >>>>>>> Only spaces and tabs are allowed. '\s' matches newline as well >>>>>>> hence not allowed. >>>>>>>> >>>>>>>> 45: "java platform""? ->?? "MD5, SDA-1, SHA-256 are supported >>>>>>>> algorithms." >>>>>>>> >>>>>>>> 49: be more specific about 'hashing is requested' how? Refer to >>>>>>>> the management.properties >>>>>>>> ? com.sun.management.jmxremote.password.hash value. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 51:? "replace hashed" -> "replace *the *hashed" >>>>>>>> 52: "with clear text or new" -> "with the clear text or the new" >>>>>>>> 52: "If new password" -> "If the new password" >>>>>>>> 53: "when new login" -> "when a new login" >>>>>>>> >>>>>>>> 60: "User generated" -> "A User generated" >>>>>>>> >>>>>>>> 67: Will the file be ignored if it has the wrong permissions. >>>>>>>> (With a logged message) >>>>>>> Addressed all the above review comments. >>>>>>>> >>>>>>>> * management.properties >>>>>>>> >>>>>>>> 306: "(Case for true/false ignored)"? - what does this mean; I >>>>>>>> think it can be removed. >>>>>>>> >>>>>>>> 307: missing period at the end of the sentence. >>>>>>>> 309: "in password file" -> "in the password file" >>>>>>>> >>>>>>> Done. >>>>>>>> >>>>>>>> * FileLoginModule.java >>>>>>>> >>>>>>>> 102: can this match better the similar name in the >>>>>>>> management.properties if it has the same function: >>>>>>>> ??? com.sun.management.jmxremote.password.hash >>>>>>> Are you suggesting that 'hashPassword' be renamed to something >>>>>>> similar to com.sun.management.jmxremote.password.hash? Variable >>>>>>> names cannot be similar to property names since property names >>>>>>> are long and provide complete context which local variables need >>>>>>> not have to do. 
>>>>>> the suffix should be the same in all places since it is a single >>>>>> semantic. >>>>> Done. >>>>>>>> 103: "replaces clear text passwords" -> "replaces each clear >>>>>>>> text password" >>>>>>>> 104: indent to match previous
enteries. >>>>>>>> >>>>>>>> * JMXPluggableAuthenticator.java >>>>>>>> >>>>>>>> 119: There is no need to copy the password to a new local >>>>>>> It is required since variables accessed from inner class must be >>>>>>> final or effectively final. >>>>>> right >>>>>>>> >>>>>>>> 128: add a space after "," >>>>>>>> >>>>>>>> 256 private static final String HASH_PASSWORDS = >>>>>>>> 257 "jmx.remote.x.password.file.hash"; >>>>>>>> >>>>>>>> The name ".hash" part does not clearly communicate that >>>>>>>> passwords are to be hashed. >>>>>>>> "hashPasswords" might be more self explanatory. >>>>>>> Changed it to "jmx.remote.x.password.file.hashpassword". >>>>>> drop the "file." >>>>> Done. >>>>>>>> Also, can this be NOT duplicated here and in >>>>>>>> ConnectorBootStrap.java? >>>>>>> The property names used in ConnectorBootStrap follows the >>>>>>> convention used in management.properties file - >>>>>>> 'com.sun.management.*'. For environment variables for a >>>>>>> JMXConnector "jmx.remote.x.*" convention is used . Hence they >>>>>>> cannot be duplicated. >>>>>> The differing prefix'es are fine as is; no change except to make >>>>>> the new keys consistent. >>>>>> >>>>>>>> >>>>>>>> >>>>>>>> * ConnectorBootStrap.java: >>>>>>>> ?482: Add space after ","s; no spaces before. >>>>>>>> >>>>>>>> 770: use the same name for the option/property if possible to >>>>>>>> avoid confusion. >>>>>>> Not possible as explained above. >>>>>>>> >>>>>>>> 770:? if the HASH_PASSWORDS static is appropriate use it >>>>>>>> instead of literal "true". >>>>>>> DefaultValues.HASH_PASSWORDS static is set to 'true' and can be >>>>>>> used. However using literal "true" is more readable than using >>>>>>> the static. >>>>>>>> >>>>>>>> * HashedPasswordManager >>>>>>>> >>>>>>>> 80-83: The fields can be final and use the constructor to >>>>>>>> initialize in all cases and make the class final >>>>>>>> to avoid unintentional subclassing. >>>>>>>> >>>>>>>> >>>>>>>> 113: canWriteToFile:?? 
It should be made clear in the template >>>>>>>> that *both* the Security policy >>>>>>>> ?? and the file access value are used to check that the file >>>>>>>> can be updated. >>>>>>> Made it explicit in template as well as code comments. >>>>>>>> >>>>>>>> 200: loadPasswords() - should this confirm the access to the >>>>>>>> file is allowed and it has >>>>>>>> the correct file access before reading? >>>>>>> Not really required. Appropriate exceptions are thrown if file >>>>>>> cannot be accessed. >>>>>>>> >>>>>>>> Is the re-writing of the passwords intended to be done by a >>>>>>>> 'priveleged' system. >>>>>>>> Does this need doPrivileged? >>>>>>> I am not sure. Maybe it will be covered in the security review. >>>>>>>> >>>>>>>> * HashedPasswordFileTest: >>>>>>>> >>>>>>>> 88: should use the TestLibrary Utils.getRandomInstance so it >>>>>>>> logs the seed and can be replayed if necessary. >>>>>>>> >>>>>>>> >>>>>>> Done >>>>>>>> Thanks, Roger >>>>>>>> >>>>>>> Thanks >>>>>>> Harsha >>>>>>>> >>>>>>>> On 4/23/2017 6:20 AM, Harsha Wardhana B wrote: >>>>>>>>> >>>>>>>>> Hi All, >>>>>>>>> >>>>>>>>> Please review this enhancement to replace plain-text password >>>>>>>>> for JMX agent with SHA-256 hash. >>>>>>>>> >>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-5016517 >>>>>>>>> >>>>>>>>> >>>>>>>>> webrev: http://cr.openjdk.java.net/~hb/5016517/webrev.00/ >>>>>>>>> >>>>>>>>> Overview of implementation: >>>>>>>>> >>>>>>>>> Currently, the JMX agent password file used to authenticate >>>>>>>>> user, stores user name and password as clear text. Though >>>>>>>>> system level restrictions are recommended for jmx password >>>>>>>>> file, passwords are vulnerable since they are stored in clear. >>>>>>>>> The current RFE proposes to store passwords as SHA256 hash >>>>>>>>> instead of clear text. 
>>>>>>>>>
>>>>>>>>> In current implementation, if password file is writable, and
>>>>>>>>> if passwords are in clear, they will be replaced by SHA256
>>>>>>>>> hash upon agent boot-up or when login attempt is made.
>>>>>>>>>
>>>>>>>>> The file,
>>>>>>>>> src/jdk.management.agent/share/conf/jmxremote.password.template
>>>>>>>>> contains more details about the implementation.
>>>>>>>>>
>>>>>>>>> - Harsha
>>>>>>>>>

From harsha.wardhana.b at oracle.com Thu Oct 12 06:49:16 2017
From: harsha.wardhana.b at oracle.com (Harsha Wardhana B)
Date: Thu, 12 Oct 2017 12:19:16 +0530
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To:
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com>
Message-ID:

Hi Roger,

On Wednesday 11 October 2017 09:21 PM, Roger Riggs wrote:
> Hi Harsha,
>
> conf/management.properties: - typo line 307: pa*sss*words
>
> HashedPasswordManager.java:
> - line 46: "classes" -> "class"
>
> - line 84-87 "private" and "static" come before "final" in declarations.
>
> - 158 and everywhere: add space after "if" before "("
>
> - line 202: add "the" before password file.
>
Done.
> - line 287: why a separate canWriteToFile(); it does the same check
> as newFileWriter(passwordFile);
> instead catch the exception and log then ignore.
It will be replaced by nio APIs.

Thanks
Harsha
>
> Looking good
>
> Regards, Roger
>
> On 10/11/2017 5:00 AM, Daniel Fuchs wrote:
>> Hi Harsha,
>>
>> Your changes look good.
However I have still a nagging doubt: >> >> What happens if two Java process share the same password file, >> and it needs hashing? Are there any protection in place >> to prevent the two processes from writing to the same >> file concurrently? >> >> best regards, >> >> -- daniel >> >> On 09/10/2017 06:34, Harsha Wardhana B wrote: >>> Hi Daniel, >>> >>> Below is the webrev addressing the review comments. >>> >>> http://cr.openjdk.java.net/~hb/5016517/webrev.04/ >>> >>> On Friday 06 October 2017 03:38 PM, Daniel Fuchs wrote: >>>> Hi Harsha, >>>> >>>> Good work! >>>> >>>> > http://cr.openjdk.java.net/~hb/5016517/webrev.03/ >>>> >>>> long standing typo in management.properties at line 90: >>>> >>>> ? measureRole => monitorRole >>> Done. >>>> >>>> HashedPasswordManager.java: loadPasswords() >>>> >>>> It seems this function will add the header to the file even >>>> if it already contains the header. >>>> >>>> So every time a user/administrator wants to change/add a password, >>>> the header will be inserted again. >>>> >>> Yes. It is fixed in the new webrev. >>>> >>>> HashedPasswordFileTest should probably have a test for this >>>> scenario as well: >>>> >>>> generate password file with clear text password >>>> load it, then verify passwords have been hashed (properly) >>>> add some new user/name password to the same file >>>> load it again, verify all passwords are hashed >>>> (do this a number of times - to make sure it doesn't >>>> ?break the second or third time) >>>> and finally verify the header is only present once ;-) >>>> >>> Done. Added a testcase for the same. >>>> I'm surprised no other tests had to be modified. >>>> Is password hash disabled by default in the default agent? >>>> >>> Password hashing is enabled by default. But it is only the >>> implementation that is changed. The pluggable JAAS mechanism >>> isolates interfaces from implementation. So in theory, all tests >>> should pass. 
>>>> If not then you should try (locally) running jtreg >>>> more than once over the default agent tests. >>>> Just make sure running the same test twice doesn't >>>> make the legacy tests that use password files failing the >>>> second time when they discover that passwords have been >>>> hashed under their feet (the client part of the test >>>> might be reading the password file too to see which >>>> password it should send to the agent). >>>> >>>> Otherwise I think it looks good to me - provided all >>>> tests are passing! >>>> >>> Done. Had a few test failures but nothing related to this enhancement. >>>> best regards, >>>> >>>> -- daniel >>> Thanks >>> Harsha >>>> >>>> On 06/10/2017 06:25, Harsha Wardhana B wrote: >>>>> Hi All, >>>>> >>>>> Previously, for default agent, hashing of the passwords was done >>>>> during the agent boot-up (ConnectorBootstrap.java). That was an >>>>> error since login configuration could be different and is >>>>> determined only when a login attempt is made. It would be then >>>>> pointless to hash the password file. The fix for above and some >>>>> off-list comments are incorporated in webrev below. >>>>> >>>>> http://cr.openjdk.java.net/~hb/5016517/webrev.03/ >>>>> >>>>> -Harsha >>>>> >>>>> >>>>> On Wednesday 04 October 2017 01:53 PM, Harsha Wardhana B wrote: >>>>>> Hi Roger, >>>>>> >>>>>> Below is the webrev incorporating changes suggested by you. >>>>>> >>>>>> http://cr.openjdk.java.net/~hb/5016517/webrev.02/ >>>>>> >>>>>> -Harsha >>>>>> >>>>>> On Wednesday 04 October 2017 12:54 AM, Roger Riggs wrote: >>>>>>> Hi Harsha, >>>>>>> >>>>>>> FileLoginModule.java:? 104:? Add a period at the end of the the >>>>>>> sentence. >>>>>>> >>>>>>> JMXPluggableAuthenticator.java: line 306:? Is the difference >>>>>>> between singular and plural significant? >>>>>>> ? It would be less confusing if both were plural (hashPasswords). >>>>>> Ok. 
>>>>>>> ConnectorBootstrap: >>>>>>> 134: ...password.file.hash" and HashedPasswordManager disagree >>>>>>> on the exact string. >>>>>>> I would propose 'hashpasswords' as the suffix in all places to >>>>>>> be consistent >>>>>>> in ConnectorBootstrap.java, HashedPasswordManager (except for >>>>>>> capitalization), >>>>>>> jmxremote.password.template, and management.properties >>>>>> Do you want to rename HashedPasswordManager class? >>>>>>> >>>>>>> As is you have a mix of "...password.hash", >>>>>>> "...password.file.hash", "...hashpassword"; >>>>>>> that's not good for knowing there is only one semantic. >>>>>>> >>>>>>> line 482:? " ," -> ", "? space after comma, not before >>>>>>> >>>>>> Will incorporate above comments. >>>>>>> line: 771: is it intentional to discard the reference to the new >>>>>>> HashedPasswordManager? >>>>>>> If the intention is only to use the side effect of >>>>>>> loadPasswords, then please >>>>>>> create a static method in HashedPasswordManager for that purpose. >>>>>>> (Even if just does the same code; it would be clear that's the >>>>>>> purpose). >>>>>>> (It probably also implies that the password file will be read a >>>>>>> second time somewhere else in the initialization). >>>>>> Static methods just to hash passwords can be created but >>>>>> HashedPasswordManager class will have to be re-factored since >>>>>> almost all methods are using instance variables. Not sure if we >>>>>> want instance methods and look-alike static methods side-by-side. >>>>>> Wouldn't that be more confusing than current implementation? >>>>>>> >>>>>>> line:770:? the string constant would be nicer as a final static >>>>>>> string somewhere. >>>>>>> ? "jmx.remote.x.password.file.hashpassword" >>>>>> All of "jmx.remote.x.*" don't have static strings. They are used >>>>>> 'as is' all over the code to maintain isolation between pluggable >>>>>> login authenticator and JDK code. 
>>>>>>> >>>>>>> Roger >>>>>>> >>>>>>> >>>>>> Harsha >>>>>>> >>>>>>> On 10/3/2017 3:47 PM, Harsha Wardhana B wrote: >>>>>>>> >>>>>>>> Hi Roger,>>> Thanks for the detailed review. Below is the >>>>>>>> webrev addressing all the review comments. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~hb/5016517/webrev.01/ >>>>>>>> >>>>>>>> -Harsha >>>>>>>> >>>>>>>> >>>>>>>> On Tuesday 25 April 2017 10:56 PM, Roger Riggs wrote: >>>>>>>>> Hi Harsha, >>>>>>>>> >>>>>>>>> Thanks for this important improvement. Comments: >>>>>>>>> >>>>>>>>> >>>>>>>>> * jmxremote.password.template: >>>>>>>>> ? "Passwords will be hashed by server if they are in clear." >>>>>>>>> Perhaps should be more explicit: >>>>>>>>> >>>>>>>>> ?? "The jmxremote.passwords file will be re-written by the >>>>>>>>> server to replace all plain text passwords with hashed >>>>>>>>> passwords when the file is read by the server." >>>>>>>>> >>>>>>>>> line 35: "Base64 encoded hash"? -> drop the "Base64" in this >>>>>>>>> line isn't needed and >>>>>>>>> make it seems like it should appear as 1 field instead of 2 or 3. >>>>>>>>> >>>>>>>>> 37+: The syntax of the file may be clearer if it includes the >>>>>>>>> complete syntax in (line 39) not >>>>>>>>> just the password/hash fragment. >>>>>>>>> >>>>>>>>> Line 41:? "W = spaces"; above "tabs" are allowed as a >>>>>>>>> delimiter; it would be good to be consistent >>>>>>>>> and include the usualy white-space characters in the set, be >>>>>>>>> as specific as possible. >>>>>>>>> Is this the same set of whitespace used by Regex '\\s'. >>>>>>>> Only spaces and tabs are allowed. '\s' matches newline as well >>>>>>>> hence not allowed. >>>>>>>>> >>>>>>>>> 45: "java platform""? ->?? "MD5, SDA-1, SHA-256 are supported >>>>>>>>> algorithms." >>>>>>>>> >>>>>>>>> 49: be more specific about 'hashing is requested' how? Refer >>>>>>>>> to the management.properties >>>>>>>>> ? com.sun.management.jmxremote.password.hash value. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 51:? 
"replace hashed" -> "replace *the *hashed" >>>>>>>>> 52: "with clear text or new" -> "with the clear text or the new" >>>>>>>>> 52: "If new password" -> "If the new password" >>>>>>>>> 53: "when new login" -> "when a new login" >>>>>>>>> >>>>>>>>> 60: "User generated" -> "A User generated" >>>>>>>>> >>>>>>>>> 67: Will the file be ignored if it has the wrong permissions. >>>>>>>>> (With a logged message) >>>>>>>> Addressed all the above review comments. >>>>>>>>> >>>>>>>>> * management.properties >>>>>>>>> >>>>>>>>> 306: "(Case for true/false ignored)"? - what does this mean; I >>>>>>>>> think it can be removed. >>>>>>>>> >>>>>>>>> 307: missing period at the end of the sentence. >>>>>>>>> 309: "in password file" -> "in the password file" >>>>>>>>> >>>>>>>> Done. >>>>>>>>> >>>>>>>>> * FileLoginModule.java >>>>>>>>> >>>>>>>>> 102: can this match better the similar name in the >>>>>>>>> management.properties if it has the same function: >>>>>>>>> ??? com.sun.management.jmxremote.password.hash >>>>>>>> Are you suggesting that 'hashPassword' be renamed to something >>>>>>>> similar to com.sun.management.jmxremote.password.hash? Variable >>>>>>>> names cannot be similar to property names since property names >>>>>>>> are long and provide complete context which local variables >>>>>>>> need not have to do. >>>>>>> the suffix should be the same in all places since it is a single >>>>>>> semantic. >>>>>> Done. >>>>>>>>> 103: "replaces clear text passwords" -> "replaces each clear >>>>>>>>> text password" >>>>>>>>> 104: indent to match previous
enteries. >>>>>>>>> >>>>>>>>> * JMXPluggableAuthenticator.java >>>>>>>>> >>>>>>>>> 119: There is no need to copy the password to a new local >>>>>>>> It is required since variables accessed from inner class must >>>>>>>> be final or effectively final. >>>>>>> right >>>>>>>>> >>>>>>>>> 128: add a space after "," >>>>>>>>> >>>>>>>>> 256 private static final String HASH_PASSWORDS = >>>>>>>>> 257 "jmx.remote.x.password.file.hash"; >>>>>>>>> >>>>>>>>> The name ".hash" part does not clearly communicate that >>>>>>>>> passwords are to be hashed. >>>>>>>>> "hashPasswords" might be more self explanatory. >>>>>>>> Changed it to "jmx.remote.x.password.file.hashpassword". >>>>>>> drop the "file." >>>>>> Done. >>>>>>>>> Also, can this be NOT duplicated here and in >>>>>>>>> ConnectorBootStrap.java? >>>>>>>> The property names used in ConnectorBootStrap follows the >>>>>>>> convention used in management.properties file - >>>>>>>> 'com.sun.management.*'. For environment variables for a >>>>>>>> JMXConnector "jmx.remote.x.*" convention is used . Hence they >>>>>>>> cannot be duplicated. >>>>>>> The differing prefix'es are fine as is; no change except to make >>>>>>> the new keys consistent. >>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> * ConnectorBootStrap.java: >>>>>>>>> ?482: Add space after ","s; no spaces before. >>>>>>>>> >>>>>>>>> 770: use the same name for the option/property if possible to >>>>>>>>> avoid confusion. >>>>>>>> Not possible as explained above. >>>>>>>>> >>>>>>>>> 770:? if the HASH_PASSWORDS static is appropriate use it >>>>>>>>> instead of literal "true". >>>>>>>> DefaultValues.HASH_PASSWORDS static is set to 'true' and can be >>>>>>>> used. However using literal "true" is more readable than using >>>>>>>> the static. >>>>>>>>> >>>>>>>>> * HashedPasswordManager >>>>>>>>> >>>>>>>>> 80-83: The fields can be final and use the constructor to >>>>>>>>> initialize in all cases and make the class final >>>>>>>>> to avoid unintentional subclassing. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> 113: canWriteToFile:?? It should be made clear in the template >>>>>>>>> that *both* the Security policy >>>>>>>>> ?? and the file access value are used to check that the file >>>>>>>>> can be updated. >>>>>>>> Made it explicit in template as well as code comments. >>>>>>>>> >>>>>>>>> 200: loadPasswords() - should this confirm the access to the >>>>>>>>> file is allowed and it has >>>>>>>>> the correct file access before reading? >>>>>>>> Not really required. Appropriate exceptions are thrown if file >>>>>>>> cannot be accessed. >>>>>>>>> >>>>>>>>> Is the re-writing of the passwords intended to be done by a >>>>>>>>> 'priveleged' system. >>>>>>>>> Does this need doPrivileged? >>>>>>>> I am not sure. Maybe it will be covered in the security review. >>>>>>>>> >>>>>>>>> * HashedPasswordFileTest: >>>>>>>>> >>>>>>>>> 88: should use the TestLibrary Utils.getRandomInstance so it >>>>>>>>> logs the seed and can be replayed if necessary. >>>>>>>>> >>>>>>>>> >>>>>>>> Done >>>>>>>>> Thanks, Roger >>>>>>>>> >>>>>>>> Thanks >>>>>>>> Harsha >>>>>>>>> >>>>>>>>> On 4/23/2017 6:20 AM, Harsha Wardhana B wrote: >>>>>>>>>> >>>>>>>>>> Hi All, >>>>>>>>>> >>>>>>>>>> Please review this enhancement to replace plain-text password >>>>>>>>>> for JMX agent with SHA-256 hash. >>>>>>>>>> >>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-5016517 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> webrev: http://cr.openjdk.java.net/~hb/5016517/webrev.00/ >>>>>>>>>> >>>>>>>>>> Overview of implementation: >>>>>>>>>> >>>>>>>>>> Currently, the JMX agent password file used to authenticate >>>>>>>>>> user, stores user name and password as clear text. Though >>>>>>>>>> system level restrictions are recommended for jmx password >>>>>>>>>> file, passwords are vulnerable since they are stored in >>>>>>>>>> clear. The current RFE proposes to store passwords as SHA256 >>>>>>>>>> hash instead of clear text. 
>>>>>>>>>>
>>>>>>>>>> In current implementation, if password file is writable, and
>>>>>>>>>> if passwords are in clear, they will be replaced by SHA256
>>>>>>>>>> hash upon agent boot-up or when login attempt is made.
>>>>>>>>>>
>>>>>>>>>> The file,
>>>>>>>>>> src/jdk.management.agent/share/conf/jmxremote.password.template
>>>>>>>>>> contains more details about the implementation.
>>>>>>>>>>
>>>>>>>>>> - Harsha
>>>>>>>>>>

From harsha.wardhana.b at oracle.com Thu Oct 12 08:16:12 2017
From: harsha.wardhana.b at oracle.com (Harsha Wardhana B)
Date: Thu, 12 Oct 2017 13:46:12 +0530
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To: <02f07607-b713-a6ef-4e2c-24d642eb595a@oracle.com>
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com> <02f07607-b713-a6ef-4e2c-24d642eb595a@oracle.com>
Message-ID:

Hi Mandy,

On Wednesday 11 October 2017 11:48 PM, mandy chung wrote:
>
> On 10/8/17 10:34 PM, Harsha Wardhana B wrote:
>>
>> Hi Daniel,
>>
>> Below is the webrev addressing the review comments.
>>
>> http://cr.openjdk.java.net/~hb/5016517/webrev.04/
>>
>
> This approach seems reasonable. I only review management.properties
> and jmxremote.password.template file.
> 304 # ################# Hash passwords in password file ##############
> 305 # com.sun.management.jmxremote.password.hashpasswords = true|false
> 306 # Default for this property is true.
> 307 # Specifies if passswords in the above file should be hashed or
> not.
> typo: passswords
> s/above file/password file/ - it has been
> referred to as "password file" in many places.
Done.
> I'm thinking any better alternative to the new property name?
> com.sun.management.jmxremote.password.hashes
> com.sun.management.jmxremote.password.asHashes
com.sun.management.jmxremote.password.toHashes
> 49 #
> https://docs.oracle.com/javase/7/docs/technotes/guides/security/StandardNames.html#MessageDigest
> 50 # MD5, SHA-1 and SHA-256 are supported algorithms.
> 51 # This is an optional field. If not specified SHA-256 will be assumed.
> I would avoid the link to the documentation of a specific JDK release.
> Maybe say:
>
> Refer to "Java Security Standard Algorithm Names Specification"
> for supported algorithm.
Will modify the file appropriately.
>
> 53 # If passwords are in clear, they will be over-written by their
> hash if all of
> s/over-written/overwritten
> 67 # If multiple entries are found for the same role name, then the last one
> 68 # is used.
> If there are multiple entries of the same role, will all entries be
> overridden with hash value? It may be better to detect as an error
> when there are more than one entries of the same role?
It would be better to log a warning. Throwing an error would seem a bit extreme.
> HashedPasswordFileTest.java
> @bug is missing
>
> Mandy
-Harsha
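For illustration only: template lines 49-51 quoted above describe an optional algorithm field that falls back to SHA-256, and earlier in the thread the stored value is described as a Base64-encoded hash. A minimal sketch of that defaulting behaviour might look like the following. It is not the code from the webrev; the class and method names are hypothetical, and it leaves out any salting or file handling the real HashedPasswordManager may perform.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of the JDK or of the reviewed change.
final class PasswordHashSketch {

    // Hashes a clear-text password and returns the digest Base64-encoded.
    // A null or empty algorithm name falls back to SHA-256, mirroring the
    // "optional field" wording in the template.
    static String hash(String clearPassword, String algorithm) {
        String name = (algorithm == null || algorithm.isEmpty()) ? "SHA-256" : algorithm;
        try {
            MessageDigest digest = MessageDigest.getInstance(name);
            byte[] hashed = digest.digest(clearPassword.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(hashed);
        } catch (NoSuchAlgorithmException e) {
            // Surface an unsupported algorithm name as an unchecked exception.
            throw new IllegalArgumentException("Unsupported algorithm: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hash("changeit", null));    // default algorithm
        System.out.println(hash("changeit", "SHA-1")); // explicit algorithm
    }
}
```

Taking the algorithm by its standard name means the set of accepted values is whatever MessageDigest supports on the running platform, which is in line with the suggestion to refer readers to the Standard Algorithm Names specification rather than to a release-specific link.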

From mandy.chung at oracle.com Thu Oct 12 15:10:26 2017
From: mandy.chung at oracle.com (mandy chung)
Date: Thu, 12 Oct 2017 08:10:26 -0700
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To:
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com> <02f07607-b713-a6ef-4e2c-24d642eb595a@oracle.com>
Message-ID: <53972784-d929-6196-ed59-5d466a273b7e@oracle.com>

On 10/12/17 1:16 AM, Harsha Wardhana B wrote:
>
>> I'm thinking any better alternative to the new property name?
>> com.sun.management.jmxremote.password.hashes
>> com.sun.management.jmxremote.password.asHashes
> com.sun.management.jmxremote.password.toHashes

I suggest renaming com.sun.management.jmxremote.password.hashpasswords to
com.sun.management.jmxremote.password.hashes.

What do you think?

>> 67 # If multiple entries are found for the same role name, then the last one
>> 68 # is used.
>> If there are multiple entries of the same role, will all entries be
>> overridden with hash value? It may be better to detect as an error
>> when there are more than one entries of the same role?
> It would be better to log a warning. Throwing an error would seem a
> bit extreme.

What happens to the duplicated entries? Will the clear password stay?
Warning is fine.

Mandy
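For illustration only: the rule quoted from template lines 67-68 (the last entry for a role is used), combined with the agreed-upon warning for duplicates, could be sketched as below. This is an assumption-laden sketch, not the reviewed implementation: the class name is hypothetical, entries are assumed to be "role" and "password" separated by spaces or tabs, and warnings are collected in a list rather than sent to the agent's logger.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the JDK or of the reviewed change.
final class DuplicateRoleSketch {

    // Parses password-file lines into role -> password, keeping only the
    // last entry for each role and recording a warning for every duplicate.
    static Map<String, String> lastEntryWins(List<String> lines, List<String> warnings) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] fields = trimmed.split("[ \\t]+", 2);
            if (fields.length < 2) {
                continue; // no password field; ignored in this sketch
            }
            // Remove any earlier entry so the surviving one is the last seen.
            if (entries.remove(fields[0]) != null) {
                warnings.add("Duplicate entry for role '" + fields[0] + "'; the last one is used");
            }
            entries.put(fields[0], fields[1]);
        }
        return entries;
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        Map<String, String> entries = lastEntryWins(
                Arrays.asList("monitorRole first", "controlRole secret", "monitorRole second"),
                warnings);
        System.out.println(entries);
        System.out.println(warnings);
    }
}
```

Whether the earlier duplicates are then dropped from the rewritten file or kept with hashed values is exactly the open question in this exchange; the sketch only covers the lookup side.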

From harsha.wardhana.b at oracle.com Thu Oct 12 15:18:20 2017
From: harsha.wardhana.b at oracle.com (Harsha Wardhana B)
Date: Thu, 12 Oct 2017 20:48:20 +0530
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To: <53972784-d929-6196-ed59-5d466a273b7e@oracle.com>
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com> <02f07607-b713-a6ef-4e2c-24d642eb595a@oracle.com> <53972784-d929-6196-ed59-5d466a273b7e@oracle.com>
Message-ID: <486455cb-1937-49a9-d55d-664cbdfc4524@oracle.com>

On Thursday 12 October 2017 08:40 PM, mandy chung wrote:
>
> On 10/12/17 1:16 AM, Harsha Wardhana B wrote:
>>
>>> I'm thinking any better alternative to the new property name?
>>> com.sun.management.jmxremote.password.hashes
>>> com.sun.management.jmxremote.password.asHashes
>> com.sun.management.jmxremote.password.toHashes
>
> I suggest renaming
> com.sun.management.jmxremote.password.hashpasswords to
> com.sun.management.jmxremote.password.hashes.
>
> What do you think?
We want the property to suggest an action and hence *.toHashes would be better than *.hashes.
>
>>> 67 # If multiple entries are found for the same role name, then the last one
>>> 68 # is used.
>>> If there are multiple entries of the same role, will all entries be
>>> overridden with hash value? It may be better to detect as an error
>>> when there are more than one entries of the same role?
>> It would be better to log a warning. Throwing an error would seem a
>> bit extreme.
>
> What happens to the duplicated entries? Will the clear password stay?
> Warning is fine.
The duplicated entries will be removed.
The last entry for a given role along with its hashed password will be written into the file.
>
> Mandy
>
Harsha

From mandy.chung at oracle.com Thu Oct 12 15:22:38 2017
From: mandy.chung at oracle.com (mandy chung)
Date: Thu, 12 Oct 2017 08:22:38 -0700
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To: <486455cb-1937-49a9-d55d-664cbdfc4524@oracle.com>
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com> <02f07607-b713-a6ef-4e2c-24d642eb595a@oracle.com> <53972784-d929-6196-ed59-5d466a273b7e@oracle.com> <486455cb-1937-49a9-d55d-664cbdfc4524@oracle.com>
Message-ID: <4fc3b0b3-3e6b-3409-ff4c-e748f2b87f3f@oracle.com>

On 10/12/17 8:18 AM, Harsha Wardhana B wrote:
>
> On Thursday 12 October 2017 08:40 PM, mandy chung wrote:
>>
>> On 10/12/17 1:16 AM, Harsha Wardhana B wrote:
>>>
>>>> I'm thinking any better alternative to the new property name?
>>>> com.sun.management.jmxremote.password.hashes
>>>> com.sun.management.jmxremote.password.asHashes
>>> com.sun.management.jmxremote.password.toHashes
>>
>> I suggest renaming
>> com.sun.management.jmxremote.password.hashpasswords to
>> com.sun.management.jmxremote.password.hashes.
>>
>> What do you think?
> We want the property to suggest an action and hence *.toHashes would
> be better than *.hashes.

The "toHashes" suffix is also fine with me.
>>
>>>> 67 # If multiple entries are found for the same role name, then the last one
>>>> 68 # is used.
>>>> If there are multiple entries of the same role, will all entries be
>>>> overridden with hash value?
>>>> It may be better to detect as an error
>>>> when there are more than one entries of the same role?
>>> It would be better to log a warning. Throwing an error would seem a
>>> bit extreme.
>>
>> What happens to the duplicated entries? Will the clear password
>> stay? Warning is fine.
> The duplicated entries will be removed. The last entry for a given
> role along with its hashed password will be written into the file.
>

The other alternative is to override it with its hash value and output a warning that this entry is ignored. This will leave it for the user to remove the entries.

Mandy

From harsha.wardhana.b at oracle.com Thu Oct 12 15:52:54 2017
From: harsha.wardhana.b at oracle.com (Harsha Wardhana B)
Date: Thu, 12 Oct 2017 21:22:54 +0530
Subject: RFE Review : JDK-5016517 - Replace plaintext passwords by hashed passwords for out-of-the-box JMX Agent
In-Reply-To: <4fc3b0b3-3e6b-3409-ff4c-e748f2b87f3f@oracle.com>
References: <6a193cdf-efea-e4e9-4faf-72d8960cd72b@Oracle.com> <0483f2c9-892e-f842-bb5f-2b522b81b199@oracle.com> <6ec697c3-cb0d-ef5b-4a0c-e26e0cfc92b0@Oracle.com> <1ceb614a-585d-2ccd-0877-f99202c43718@oracle.com> <5892e0b2-51c6-6a22-5a39-d4ab93ca2fcf@oracle.com> <85633749-9830-dcd5-97a5-411f684776bd@oracle.com> <83690f04-4cdb-ce11-a744-7d4372ec7500@oracle.com> <02f07607-b713-a6ef-4e2c-24d642eb595a@oracle.com> <53972784-d929-6196-ed59-5d466a273b7e@oracle.com> <486455cb-1937-49a9-d55d-664cbdfc4524@oracle.com> <4fc3b0b3-3e6b-3409-ff4c-e748f2b87f3f@oracle.com>
Message-ID: <37a59c19-d663-63b0-5524-9c0473ea8359@oracle.com>

Sure. I will send out a modified webrev soon.

-Harsha

On Thursday 12 October 2017 08:52 PM, mandy chung wrote:
>
> On 10/12/17 8:18 AM, Harsha Wardhana B wrote:
>>
>> On Thursday 12 October 2017 08:40 PM, mandy chung wrote:
>>>
>>> On 10/12/17 1:16 AM, Harsha Wardhana B wrote:
>>>>
>>>>> I'm thinking any better alternative to the new property name?
>>>>> com.sun.management.jmxremote.password.hashes >>>>> com.sun.management.jmxremote.password.asHashes com.sun.management.jmxremote.passowrd.toHashes >>> >>> I suggest to rename >>> com.sun.management.jmxremote.password.hashpasswords to >>> com.sun.management.jmxremote.password.hashes. >>> >>> What do you think? >> We want the property to suggest an action and hence *.toHashes would >> be better than *.hashes. > > "toHashes" suffix is also good to me. >>> >>>>> 67 # If multiple entries are found for the same role name, then >>>>> the last one 68 # is used. >>>>> If there are multiple entries of the same role, will all entries >>>>> be overridden with hash value? It may be better to detect as an >>>>> error when there are more than one entries of the same role? >>>> It would be better to log a warning. Throwing an error would seem a >>>> bit extreme. >>> >>> What happen to the duplicated entries?? The clear password will >>> stay?? Warning is fine. >> The duplicated entries will be removed. The last entry for a given >> role along with its hashed password will be written into the file. >> > > The other alternative is to override it with its hash value and output > a warning that this entry is ignored.?? This will leave it for the > user to remove the entries. > > Mandy > -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Thu Oct 12 22:21:24 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 12 Oct 2017 15:21:24 -0700 Subject: RFR (XS): 8175510 Null pointer dereference in getModuleObject of JPLISAgent.c:790 Message-ID: An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Fri Oct 13 01:01:12 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 13 Oct 2017 11:01:12 +1000 Subject: RFR (XS): 8175510 Null pointer dereference in getModuleObject of JPLISAgent.c:790 In-Reply-To: References: Message-ID: Hi Serguei, Seems quite reasonable. Reviewed. 

Thanks,
David

On 13/10/2017 8:21 AM, serguei.spitsyn at oracle.com wrote:
> Please, review a fix for the Parfait bug:
>   https://bugs.openjdk.java.net/browse/JDK-8175510
>
> Webrev:
>   http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8175510-jplis-parfait.1/
>
> Summary:
>
>   This is the main fragment from the Parfait report:
>
>   getModuleObject
>   jdk/src/java.instrument/share/native/libinstrument/JPLISAgent.c
>   jdk-9+180-JDK9_linux
>
>   783.   int len = (last_slash == NULL) ? 0 : (int)(last_slash - cname);
>   784.   char* pkg_name_buf = (char*)malloc(len + 1);
>   785.
>   786.   jplis_assert_msg(pkg_name_buf != NULL, "OOM error in native tmp buffer allocation");
>          Pointer checked against constant 'NULL' but does not protect the
>          dereference.
>   787.   if (last_slash != NULL) {
>   788.       strncpy(pkg_name_buf, cname, len);
>   789.   }
>   790.   pkg_name_buf[len] = '\0';
>          *Null pointer dereference not protected by null check*
>          Write to pointer pkg_name_buf that could be constant 'NULL'
>
>   The malloc can return NULL in a case of OOME.
>   The assert at L786 checks the returned pointer for NULL but does not
>   protect the dereference at L790.
>   The fix is to replace the assert with printing an error message and
>   returning NULL from getModuleObject().
>   It must be safe as the returned result is passed to
>   sun.instrument.InstrumentationImpl.transform(),
>   which handles null passed as the module parameter.
>
> Thanks,
> Serguei
>

From serguei.spitsyn at oracle.com Fri Oct 13 02:12:27 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 12 Oct 2017 19:12:27 -0700
Subject: RFR (XS): 8175510 Null pointer dereference in getModuleObject of JPLISAgent.c:790
In-Reply-To:
References:
Message-ID: <3d2dee4f-6849-66c7-5859-97790da93013@oracle.com>

Thanks for the quick review, David!
Serguei

On 10/12/17 18:01, David Holmes wrote:
> Hi Serguei,
>
> Seems quite reasonable.
>
> Reviewed.
>
> Thanks,
> David
>
> On 13/10/2017 8:21 AM, serguei.spitsyn at oracle.com wrote:
>> Please, review a fix for the Parfait bug:
>>   https://bugs.openjdk.java.net/browse/JDK-8175510
>>
>> Webrev:
>>   http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8175510-jplis-parfait.1/
>>
>> Summary:
>>
>>   This is the main fragment from the Parfait report:
>>
>>   getModuleObject
>>   jdk/src/java.instrument/share/native/libinstrument/JPLISAgent.c
>>   jdk-9+180-JDK9_linux
>>
>>   783.   int len = (last_slash == NULL) ? 0 : (int)(last_slash - cname);
>>   784.   char* pkg_name_buf = (char*)malloc(len + 1);
>>   785.
>>   786.   jplis_assert_msg(pkg_name_buf != NULL, "OOM error in native tmp buffer allocation");
>>          Pointer checked against constant 'NULL' but does not protect the
>>          dereference.
>>   787.   if (last_slash != NULL) {
>>   788.       strncpy(pkg_name_buf, cname, len);
>>   789.   }
>>   790.   pkg_name_buf[len] = '\0';
>>          *Null pointer dereference not protected by null check*
>>          Write to pointer pkg_name_buf that could be constant 'NULL'
>>
>>   The malloc can return NULL in a case of OOME.
>>   The assert at L786 checks the returned pointer for NULL but does
>>   not protect the dereference at L790.
>>   The fix is to replace the assert with printing an error message and
>>   returning NULL from getModuleObject().
>>   It must be safe as the returned result is passed to
>>   sun.instrument.InstrumentationImpl.transform(),
>>   which handles null passed as the module parameter.
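
The fix described in the summary above can be sketched in plain C as follows. This is only a minimal illustration under stated assumptions, not the actual JPLISAgent.c change from the webrev: package_name_of() is a hypothetical helper, the error message is simply written to stderr, and the caller is assumed to treat a NULL result as "no package name available".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not JDK code): returns a newly allocated copy of the
 * package part of a slash-separated class name, e.g. "java/lang/String"
 * gives "java/lang". Returns NULL if the allocation fails; the caller owns
 * and frees the result. */
static char *package_name_of(const char *cname) {
    /* Precondition on the caller, not a runtime failure mode. */
    assert(cname != NULL);

    const char *last_slash = strrchr(cname, '/');
    size_t len = (last_slash == NULL) ? 0 : (size_t)(last_slash - cname);
    char *pkg_name_buf = (char *)malloc(len + 1);

    if (pkg_name_buf == NULL) {
        /* Report and return early, so the write to pkg_name_buf[len]
         * below can never go through a NULL pointer. */
        fprintf(stderr, "OOM error in native tmp buffer allocation\n");
        return NULL;
    }
    if (last_slash != NULL) {
        /* The first len bytes of cname are known to exist. */
        memcpy(pkg_name_buf, cname, len);
    }
    pkg_name_buf[len] = '\0';
    return pkg_name_buf;
}
```

The point of the early return is that the allocation check and the dereference sit on the same control-flow path, which is what the Parfait report says the assert did not guarantee.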
>>
>> Thanks,
>> Serguei
>>

From serguei.spitsyn at oracle.com Fri Oct 13 04:58:04 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 12 Oct 2017 21:58:04 -0700
Subject: RFR (S): 8187289 NotifyFramePop request is not cleared if JVMTI_EVENT_FRAME_POP is disabled
Message-ID:

An HTML attachment was scrubbed...
URL:

From robin.westberg at oracle.com Fri Oct 13 14:54:58 2017
From: robin.westberg at oracle.com (Robin Westberg)
Date: Fri, 13 Oct 2017 16:54:58 +0200
Subject: RFR(XS): 8173917: Safepoint ID is not consistent across event-based tracing events
Message-ID: <1D05B90A-B036-4AFE-AF81-FD188520D347@oracle.com>

Hi all,

Please review the following change that aligns the safepointId value for
the SafepointStateSynchronization event to be consistent with the other
related events committed later.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8173917

Webrev:
http://cr.openjdk.java.net/~egahlin/8173917_0/

Best regards,
Robin

From robin.westberg at oracle.com Fri Oct 13 14:56:10 2017
From: robin.westberg at oracle.com (Robin Westberg)
Date: Fri, 13 Oct 2017 16:56:10 +0200
Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations
In-Reply-To:
References: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> <3059930F-FB64-43A2-A52E-AED01B95796F@oracle.com>
Message-ID: <18F21D0A-473B-4BF1-90AA-DCCBAF81D38F@oracle.com>

Hi again,

Here's an updated version that adds a separate event for the
self-revocation path. It's a new event class as it is a bit different
from the non-self-revocation path; it does not have any relevant
safepoint ID, for example.

Webrev:
http://cr.openjdk.java.net/~egahlin/8187042_2/

>>> Third, I would have expected to see more detail in the event such as which thread (id) the object was biased to and which thread revoked the bias. Even perhaps some notion of which instance was involved (though that's harder to show).
>> Right, I've been looking at capturing which thread the object was biased towards, but I was afraid of the possible races there as the thread pointer in the mark would have to be saved before executing the VM operation. For that to work 100% reliably I suspect it would have to be done inside the safepoint.
>
> Right the thread holding the bias may not even exist any more! This may need to utilise the new Thread-SMR work (as a future RFE of course). :)

Ah yeah, that may be an effective way of doing it. Another idea
suggested by Markus Grönlund was to capture the thread's id inside the
operation and propagate it through an additional field in the VM
operation class. But anyway, I'll file a separate RFE for investigating
that improvement.

Best regards,
Robin

>> I will create an updated webrev after looking into adding an event for the self-revocation path.
>
> Thanks,
> David
>
>> Best regards,
>> Robin

From erik.gahlin at oracle.com Fri Oct 13 15:11:29 2017
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Fri, 13 Oct 2017 17:11:29 +0200
Subject: RFR(XS): 8189274: Allow cutoff attribute for event based tracing
Message-ID: <59E0D7A1.4090607@oracle.com>

Hi,

Could I have a review of this small change that will add the capability
to use a cutoff attribute on an event.

Webrev:
http://cr.openjdk.java.net/~egahlin/8189274/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8189274

Thanks
Erik

From markus.gronlund at oracle.com Fri Oct 13 15:14:38 2017
From: markus.gronlund at oracle.com (Markus Gronlund)
Date: Fri, 13 Oct 2017 08:14:38 -0700 (PDT)
Subject: RFR(XS): 8189274: Allow cutoff attribute for event based tracing
In-Reply-To: <59E0D7A1.4090607@oracle.com>
References: <59E0D7A1.4090607@oracle.com>
Message-ID:

Looks good Erik,

Thanks for adding this.

Markus

-----Original Message-----
From: Erik Gahlin
Sent: 13 October 2017 17:11
To: serviceability-dev at openjdk.java.net
Subject: RFR(XS): 8189274: Allow cutoff attribute for event based tracing

Hi,

Could I have a review of this small change that will add the capability
to use a cutoff attribute on an event.

Webrev:
http://cr.openjdk.java.net/~egahlin/8189274/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8189274

Thanks
Erik

From david.holmes at oracle.com Sat Oct 14 12:34:26 2017
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 14 Oct 2017 22:34:26 +1000
Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations
In-Reply-To: <18F21D0A-473B-4BF1-90AA-DCCBAF81D38F@oracle.com>
References: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> <3059930F-FB64-43A2-A52E-AED01B95796F@oracle.com> <18F21D0A-473B-4BF1-90AA-DCCBAF81D38F@oracle.com>
Message-ID:

Thanks Robin, these updates seem fine to me.

David

On 14/10/2017 12:56 AM, Robin Westberg wrote:
> Hi again,
>
> Here's an updated version that adds a separate event for the
> self-revocation path. It's a new event class as it is a bit different
> from the non-self-revocation path; it does not have any relevant
> safepoint ID, for example.
>
> Webrev:
> http://cr.openjdk.java.net/~egahlin/8187042_2/
>
>>>> Third, I would have expected to see more detail in the event such as
>>>> which thread (id) the object was biased to and which thread revoked
>>>> the bias. Even perhaps some notion of which instance was involved
>>>> (though that's harder to show).
>>> Right, I've been looking at capturing which thread the object was
>>> biased towards, but I was afraid of the possible races there as the
>>> thread pointer in the mark would have to be saved before executing
>>> the VM operation. For that to work 100% reliably I suspect it would
>>> have to be done inside the safepoint.
>>
>> Right the thread holding the bias may not even exist any more! This
>> may need to utilise the new Thread-SMR work (as a future RFE of
>> course). :)
>
> Ah yeah, that may be an effective way of doing it. Another idea
> suggested by Markus Grönlund was to capture the thread's id inside the
> operation and propagate it through an additional field in the VM
> operation class. But anyway, I'll file a separate RFE for investigating
> that improvement.
>
> Best regards,
> Robin
>
>>> I will create an updated webrev after looking into adding an event
>>> for the self-revocation path.
>>
>> Thanks,
>> David
>>
>>> Best regards,
>>> Robin
>

From erik.gahlin at oracle.com Sun Oct 15 21:34:55 2017
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Sun, 15 Oct 2017 23:34:55 +0200
Subject: RFR(XS): 8173917: Safepoint ID is not consistent across event-based tracing events
In-Reply-To: <1D05B90A-B036-4AFE-AF81-FD188520D347@oracle.com>
References: <1D05B90A-B036-4AFE-AF81-FD188520D347@oracle.com>
Message-ID: <37C820FE-E24D-441B-926A-E3A2D26E0FA1@oracle.com>

Looks good.

Should I push this change before the new events?

Thanks
Erik

> On 13 Oct 2017, at 16:54, Robin Westberg wrote:
>
> Hi all,
>
> Please review the following change that aligns the safepointId value for the SafepointStateSynchronization event to be consistent with the other related events committed later.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8173917
>
> Webrev:
> http://cr.openjdk.java.net/~egahlin/8173917_0/
>
> Best regards,
> Robin

From thomas.stuefe at gmail.com Mon Oct 16 10:40:06 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 16 Oct 2017 12:40:06 +0200
Subject: Status of JEP159?
Message-ID:

Hi all,

just a small question.

While examining a crash in jvmti_GetClassMethods (jdk9) I noticed that I
am able to successfully add and remove methods in a redefined class.

But JEP159 is still only in "submitted" stage. Was this feature added
for another JEP?

Thank you!

Kind Regards, Thomas

From david.holmes at oracle.com Mon Oct 16 11:20:18 2017
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Oct 2017 21:20:18 +1000
Subject: Status of JEP159?
In-Reply-To:
References:
Message-ID: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com>

Hi Thomas,

On 16/10/2017 8:40 PM, Thomas Stüfe wrote:
> Hi all,
>
> just a small question.
>
> While examining a crash in jvmti_GetClassMethods (jdk9) I noticed that I
> am able to successfully add and remove methods in a redefined class.
>
> But JEP159 is still only in "submitted" stage. Was this feature added
> for another JEP?

According to the spec, you are not allowed to add/remove methods. How
did you add/remove them?

https://docs.oracle.com/javase/9/docs/specs/jvmti.html#RedefineClasses

David
-----

> Thank you!
>
> Kind Regards, Thomas

From markus.gronlund at oracle.com Mon Oct 16 11:23:11 2017
From: markus.gronlund at oracle.com (Markus Gronlund)
Date: Mon, 16 Oct 2017 04:23:11 -0700 (PDT)
Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations
In-Reply-To: <18F21D0A-473B-4BF1-90AA-DCCBAF81D38F@oracle.com>
References: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> <3059930F-FB64-43A2-A52E-AED01B95796F@oracle.com> <18F21D0A-473B-4BF1-90AA-DCCBAF81D38F@oracle.com>
Message-ID: <5d1b68c0-b40c-4ed8-8b50-edb485be8d44@default>

Hi Robin,

Looks good.

Thanks
Markus

From: Robin Westberg
Sent: 13 October 2017 16:56
To: David Holmes
Cc: serviceability-dev at openjdk.java.net
Subject: Re: RFR: 8187042: Events to show which objects are associated with biased object revocations

Hi again,

Here's an updated version that adds a separate event for the
self-revocation path. It's a new event class as it is a bit different
from the non-self-revocation path; it does not have any relevant
safepoint ID, for example.

Webrev:
http://cr.openjdk.java.net/~egahlin/8187042_2/

Third, I would have expected to see more detail in the event such as
which thread (id) the object was biased to and which thread revoked the
bias. Even perhaps some notion of which instance was involved (though
that's harder to show).

Right, I've been looking at capturing which thread the object was biased
towards, but I was afraid of the possible races there as the thread
pointer in the mark would have to be saved before executing the VM
operation. For that to work 100% reliably I suspect it would have to be
done inside the safepoint.

Right the thread holding the bias may not even exist any more! This may
need to utilise the new Thread-SMR work (as a future RFE of course). :)

Ah yeah, that may be an effective way of doing it. Another idea
suggested by Markus Grönlund was to capture the thread's id inside the
operation and propagate it through an additional field in the VM
operation class. But anyway, I'll file a separate RFE for investigating
that improvement.

Best regards,
Robin

I will create an updated webrev after looking into adding an event for
the self-revocation path.

Thanks,
David

Best regards,
Robin

From markus.gronlund at oracle.com Mon Oct 16 11:24:46 2017
From: markus.gronlund at oracle.com (Markus Gronlund)
Date: Mon, 16 Oct 2017 04:24:46 -0700 (PDT)
Subject: RFR(XS): 8173917: Safepoint ID is not consistent across event-based tracing events
In-Reply-To: <1D05B90A-B036-4AFE-AF81-FD188520D347@oracle.com>
References: <1D05B90A-B036-4AFE-AF81-FD188520D347@oracle.com>
Message-ID: <5f62c87d-2175-4caa-ae78-74712c128c70@default>

Hi Robin,

Looks good.

Thanks
Markus

From: Robin Westberg
Sent: 13 October 2017 16:55
To: serviceability-dev at openjdk.java.net
Subject: RFR(XS): 8173917: Safepoint ID is not consistent across event-based tracing events

Hi all,

Please review the following change that aligns the safepointId value for
the SafepointStateSynchronization event to be consistent with the other
related events committed later.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8173917

Webrev:
http://cr.openjdk.java.net/~egahlin/8173917_0/

Best regards,
Robin

From robbin.ehn at oracle.com Mon Oct 16 13:21:57 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 16 Oct 2017 15:21:57 +0200
Subject: Status of JEP159?
In-Reply-To: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com>
References: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com>
Message-ID: <24b9381a-96fd-1a21-8e0d-069e0338ae5f@oracle.com>

Hi, if you use class file load hook you can add/remove public methods.
Since this is before the class has been published we don't know how it
should look.
Whether this is according to spec or not, I have no clue.

Is it on CFLH?

/Robbin

On 10/16/2017 01:20 PM, David Holmes wrote:
> Hi Thomas,
>
> On 16/10/2017 8:40 PM, Thomas Stüfe wrote:
>> Hi all,
>>
>> just a small question.
>>
>> While examining a crash in jvmti_GetClassMethods (jdk9) I noticed that I am able to successfully add and remove methods in a redefined class.
>>
>> But JEP159 is still only in "submitted" stage. Was this feature added for another JEP?
>
> According to the spec, you are not allowed to add/remove methods. How did you add/remove them?
>
> https://docs.oracle.com/javase/9/docs/specs/jvmti.html#RedefineClasses
>
> David
> -----
>
>> Thank you!
>>
>> Kind Regards, Thomas

From yasuenag at gmail.com Mon Oct 16 13:25:38 2017
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Mon, 16 Oct 2017 22:25:38 +0900
Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To:
References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com>
Message-ID: <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com>

PING: Could you review it?

> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/

Thanks,

Yasumasa

On 2017/10/03 13:18, Yasumasa Suenaga wrote:
> Hi all,
>
> I added gtest unit test case for this change in new webrev:
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/
>
> Could you review it?
>
>
> Thanks,
>
> Yasumasa
>
>
>
> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga :
>> Hi all,
>>
>> I uploaded new webrev to be adapted to jdk10/hs:
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2017/09/21 7:45, Yasumasa Suenaga wrote:
>>>
>>> PING:
>>>
>>> Have you checked this issue?
>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/
>>>
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote:
>>>>
>>>> PING:
>>>>
>>>> Have you checked this issue?
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I want to discuss JDK-8151815: Could not parse core image with
>>>>> JSnap.
>>>>>
>>>>>
>>>>> Last year, I found JSnap cannot parse coredump and I've sent review
>>>>> request for it as JDK-8151815. However it has not been reviewed yet
>>>>> [1].
>>>>>
>>>>> We've discussed about safety implementation, but we could not get
>>>>> consensus.
>>>>> IMHO all SA tools should handle java processes and core images,
>>>>> and PerfCounter value is useful. So I fix this issue.
>>>>>
>>>>> I uploaded new webrev for this issue. I think this patch is safe
>>>>> because new flag PerfMemory::_destroyed guards double free, and all
>>>>> members in PerfMemory are accessible (they are not munmap'ed)
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/
>>>>>
>>>>>
>>>>> Can you cooperate?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> [1]
>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html
>>>>>
>>

From Alan.Bateman at oracle.com Mon Oct 16 13:31:09 2017
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 16 Oct 2017 14:31:09 +0100
Subject: Status of JEP159?
In-Reply-To: <24b9381a-96fd-1a21-8e0d-069e0338ae5f@oracle.com>
References: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com> <24b9381a-96fd-1a21-8e0d-069e0338ae5f@oracle.com>
Message-ID: <4b3a9eb6-9ad0-37e7-596e-11a5130474a6@oracle.com>

On 16/10/2017 14:21, Robbin Ehn wrote:
> Hi, if you use class file load hook you can add/remove public methods.
> Since this is before the class has been published we don't know how
> it should look.
> Whether this is according to spec or not, I have no clue.
>
> Is it on CFLH?
>
No issue adding or removing methods or making any other changes to the
class file in the CFLH but only for the initial load. The CFLH will be
re-run when the class is transformed (RetransformClasses) but that
cannot add/remove methods or do other schema changes.

-Alan

From robbin.ehn at oracle.com Mon Oct 16 13:55:31 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 16 Oct 2017 15:55:31 +0200
Subject: Status of JEP159?
In-Reply-To: <4b3a9eb6-9ad0-37e7-596e-11a5130474a6@oracle.com>
References: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com> <24b9381a-96fd-1a21-8e0d-069e0338ae5f@oracle.com> <4b3a9eb6-9ad0-37e7-596e-11a5130474a6@oracle.com>
Message-ID: <50817987-269d-21d2-c61e-ab239e208a6b@oracle.com>

On 10/16/2017 03:31 PM, Alan Bateman wrote:
>
> On 16/10/2017 14:21, Robbin Ehn wrote:
>> Hi, if you use class file load hook you can add/remove public methods.
>> Since this is before the class has been published we don't know how it should look.
>> Whether this is according to spec or not, I have no clue.
>>
>> Is it on CFLH?
>>
> No issue adding or removing methods or making any other changes to the class file in the CFLH but only for the initial load. The CFLH will be re-run when the class is
> transformed (RetransformClasses) but that cannot add/remove methods or do other schema changes.

There is actually an issue: we start all transformations with the
'on disk' version.
If the agent that added a public method exits (e.g. via
removeTransformer), we can never re-transform the class; instead we get:
"error method delete"
It has been suggested that we should use the 'first published' class
version as a baseline (the version after CFLH), but that would break
current agents (I assume).

This is my old patch for it:

diff -r 46a21d1c5f1c src/share/vm/prims/jvmtiExport.cpp
--- a/src/share/vm/prims/jvmtiExport.cpp        Fri Aug 12 14:12:55 2016 -0700
+++ b/src/share/vm/prims/jvmtiExport.cpp        Tue Aug 16 16:22:29 2016 +0200
@@ -661,7 +661,8 @@
         if (env->is_retransformable() && env->is_enabled(JVMTI_EVENT_CLASS_FILE_LOAD_HOOK)) {
           // retransformable agents need to cache the original class file
           // bytes if changes are made via the ClassFileLoadHook
-          post_to_env(env, true);
+          // cache the last version after load is completed, hence the published version
+          post_to_env(env, _load_kind != jvmti_class_load_kind_load);
         }
       }
     }

Is it a bug or work as intended?
(or a bug we can't fix)

/Robbin

>
> -Alan

From robin.westberg at oracle.com Mon Oct 16 15:02:35 2017
From: robin.westberg at oracle.com (Robin Westberg)
Date: Mon, 16 Oct 2017 17:02:35 +0200
Subject: RFR: 8187042: Events to show which objects are associated with biased object revocations
In-Reply-To: <5d1b68c0-b40c-4ed8-8b50-edb485be8d44@default>
References: <61AD2E91-A4E1-4575-A065-7A4F5B0FBFAF@oracle.com> <3059930F-FB64-43A2-A52E-AED01B95796F@oracle.com> <18F21D0A-473B-4BF1-90AA-DCCBAF81D38F@oracle.com> <5d1b68c0-b40c-4ed8-8b50-edb485be8d44@default>
Message-ID: <48E53B0F-FD1F-44CF-912D-37661F5906C6@oracle.com>

Thanks for the reviews Markus, Erik and David!

Filed https://bugs.openjdk.java.net/browse/JDK-8189368 for the
improvement to add information on the thread currently holding the bias
when it is revoked.

Best regards,
Robin

> On 16 Oct 2017, at 13:23, Markus Gronlund wrote:
>
> Hi Robin,
>
> Looks good.
>
> Thanks
> Markus
>
> From: Robin Westberg
> Sent: 13 October 2017 16:56
> To: David Holmes
> Cc: serviceability-dev at openjdk.java.net
> Subject: Re: RFR: 8187042: Events to show which objects are associated with biased object revocations
>
> Hi again,
>
> Here's an updated version that adds a separate event for the self-revocation path. It's a new event class as it is a bit different from the non-self-revocation path; it does not have any relevant safepoint ID, for example.
>
> Webrev:
> http://cr.openjdk.java.net/~egahlin/8187042_2/
>
> Third, I would have expected to see more detail in the event such as which thread (id) the object was biased to and which thread revoked the bias. Even perhaps some notion of which instance was involved (though that's harder to show).
> Right, I've been looking at capturing which thread the object was biased towards, but I was afraid of the possible races there as the thread pointer in the mark would have to be saved before executing the VM operation. For that to work 100% reliably I suspect it would have to be done inside the safepoint.
>
> Right the thread holding the bias may not even exist any more! This may need to utilise the new Thread-SMR work (as a future RFE of course). :)
>
> Ah yeah, that may be an effective way of doing it. Another idea suggested by Markus Grönlund was to capture the thread's id inside the operation and propagate it through an additional field in the VM operation class. But anyway, I'll file a separate RFE for investigating that improvement.
>
> Best regards,
> Robin
>
>
> I will create an updated webrev after looking into adding an event for the self-revocation path.
>
> Thanks,
> David
>
>
> Best regards,
> Robin

From thomas.stuefe at gmail.com Mon Oct 16 15:03:43 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 16 Oct 2017 17:03:43 +0200
Subject: Status of JEP159?
In-Reply-To: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com>
References: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com>
Message-ID:

Hi David,

On Mon, Oct 16, 2017 at 1:20 PM, David Holmes wrote:
> Hi Thomas,
>
> On 16/10/2017 8:40 PM, Thomas Stüfe wrote:
>
>> Hi all,
>>
>> just a small question.
>>
>> While examining a crash in jvmti_GetClassMethods (jdk9) I noticed that I
>> am able to successfully add and remove methods in a redefined class.
>>
>> But JEP159 is still only in "submitted" stage. Was this feature added for
>> another JEP?
>>
>
> According to the spec, you are not allowed to add/remove methods. How did
> you add/remove them?
>
> https://docs.oracle.com/javase/9/docs/specs/jvmti.html#RedefineClasses
>
> David
> -----

I used jdb (redefine). I found that add/remove method worked for private
methods, but not for public ones, so that explains it.

I was examining a bug which now turned out to be a regression of
https://bugs.openjdk.java.net/browse/JDK-8149743 - only in my case it
was not a lambda method but just an ordinary private method.

Sorry for the noise.

Thomas

>
>
> Thank you!
>>
>> Kind Regards, Thomas
>>
>

From Alan.Bateman at oracle.com Mon Oct 16 15:44:55 2017
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 16 Oct 2017 16:44:55 +0100
Subject: Status of JEP159?
In-Reply-To: <50817987-269d-21d2-c61e-ab239e208a6b@oracle.com>
References: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com> <24b9381a-96fd-1a21-8e0d-069e0338ae5f@oracle.com> <4b3a9eb6-9ad0-37e7-596e-11a5130474a6@oracle.com> <50817987-269d-21d2-c61e-ab239e208a6b@oracle.com>
Message-ID: <9057365e-1f47-802c-1004-62b9267ba4e8@oracle.com>

On 16/10/2017 14:55, Robbin Ehn wrote:
>
> There is actually an issue, we start all transformation with 'on' disk
> version.
> If the agent that did the addition of a public method e.g.
> exits(removeTransformer) we can never re-transform it, instead we get:
> "error method delete"
> It have been suggested that we should use 'first published' class
> version as a baseline (the version after CFLH), but would break
> current agents (I assume).
>
> :
> Is it a bug or work as intended? (or a bug we can't fix)

If all agents (or JVM TI environments) are retransformation capable then
retransformClasses should send the initial class file bytes (or the "on
disk" version as you termed it) to the CFLH of the first agent. If a
retransformation capable agent adds a method in the initial load then it
should add it again when called to retransform the class.

On the other hand, if there are retransformation incapable agents in the
picture then the class file bytes sent to the CFLH of the first
retransformation capable agent will be the class bytes output by the
retransformation incapable agents.
So if retransformation incapable agent adds a method in the initial load
then that method will exist in the class bytes that the retransformation
capable agents see when they retransform.

-Alan

From robbin.ehn at oracle.com Mon Oct 16 15:46:07 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 16 Oct 2017 17:46:07 +0200
Subject: Low-Overhead Heap Profiling
In-Reply-To:
References: <1497366226.2829.109.camel@oracle.com> <1498215147.2741.34.camel@oracle.com> <044f8c75-72f3-79fd-af47-7ee875c071fd@oracle.com> <23f4e6f5-c94e-01f7-ef1d-5e328d4823c8@oracle.com>
Message-ID: <5ec70351-910a-96bb-eb03-43ca88bd6259@oracle.com>

Hi JC,

I saw a webrev.12 in the directory, with only test changes(11->12), so I
took that version.
I had a look and tested the tests, worked fine!

First glance at the code (looking at full v12) some minor things below,
mostly unused stuff.

Thanks, Robbin

diff -r 9047e0d726d6 src/hotspot/share/runtime/heapMonitoring.cpp
--- a/src/hotspot/share/runtime/heapMonitoring.cpp      Mon Oct 16 16:54:06 2017 +0200
+++ b/src/hotspot/share/runtime/heapMonitoring.cpp      Mon Oct 16 17:42:42 2017 +0200
@@ -211,2 +211,3 @@
   void initialize(int max_storage) {
+    // validate max_storage to sane value ? What would 0 mean ?
     MutexLocker mu(HeapMonitor_lock);
@@ -227,8 +228,4 @@
   bool initialized() { return _initialized; }
-  volatile bool *initialized_address() { return &_initialized; }

  private:
-  // Protects the traces currently sampled (below).
-  volatile intptr_t _stack_storage_lock[1];
-
   // The traces currently sampled.
@@ -313,3 +310,2 @@
   _initialized(false) {
-    _stack_storage_lock[0] = 0;
 }
@@ -532,13 +528,2 @@

-// Delegate the initialization question to the underlying storage system.
-bool HeapMonitoring::initialized() {
-  return StackTraceStorage::storage()->initialized();
-}
-
-// Delegate the initialization question to the underlying storage system.
-bool *HeapMonitoring::initialized_address() {
-  return
-      const_cast(StackTraceStorage::storage()->initialized_address());
-}
-
 void HeapMonitoring::get_live_traces(jvmtiStackTraces *traces) {
diff -r 9047e0d726d6 src/hotspot/share/runtime/heapMonitoring.hpp
--- a/src/hotspot/share/runtime/heapMonitoring.hpp      Mon Oct 16 16:54:06 2017 +0200
+++ b/src/hotspot/share/runtime/heapMonitoring.hpp      Mon Oct 16 17:42:42 2017 +0200
@@ -35,3 +35,2 @@
   static uint64_t _rnd;
-  static bool _initialized;
   static jint _monitoring_rate;
@@ -92,7 +91,2 @@

-  // Is the profiler initialized and where is the address to the initialized
-  // boolean.
-  static bool initialized();
-  static bool *initialized_address();
-
   // Called when o is to be sampled from a given thread and a given size.

On 10/10/2017 12:57 AM, JC Beyler wrote:
> Dear all,
>
> Thread-safety is back!! Here is the update webrev:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/
>
> Full webrev is here:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.11/
>
> In order to really test this, I needed to add this so thought now was a good time. It required a few changes here for the creation to ensure correctness and safety. Now we
> keep the static pointer but clear the data internally so on re-initialize, it will be a bit more costly than before. I don't think this is a huge use-case so I did not
> think it was a problem. I used the internal MutexLocker, I think I used it well, let me know.
>
> I also added three tests:
>
> 1) Stack depth test:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStackDepthTest.java.patch
>
> This test shows that the maximum stack depth system is working.
>
> 2) Thread safety:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadTest.java.patch
>
> The test creates 24 threads and they all allocate at the same time.
The test then checks it does find samples from all the threads.
>
> 3) Thread on/off safety
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadOnOffTest.java.patch
>
> The test creates 24 threads that all allocate a bunch of memory. Then another thread turns the sampling on/off.
>
> Btw, both tests 2 & 3 failed without the locks.
>
> As I worked on this, I saw a lot of places where the tests are doing very similar things, I'm going to clean up the code a bit and make a HeapAllocator class that all tests
> can call directly. This will greatly simplify the code.
>
> Thanks for any comments/criticisms!
> Jc
>
>
> On Mon, Oct 2, 2017 at 8:52 PM, JC Beyler wrote:
>
> Dear all,
>
> Small update to the webrev:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.09_10/
>
> Full webrev is here:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/
>
> I updated a bit of the naming, removed a TODO comment, and I added a test for testing the sampling rate. I also updated the maximum stack depth to 1024, there is no
> reason to keep it so small. I did a micro benchmark that tests the overhead and it seems relatively the same.
>
> I compared allocations from a stack depth of 10 and allocations from a stack depth of 1024 (allocations are from the same helper method in
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
> ):
>           - For an array of 1 integer allocated in a loop; stack depth 1024 vs stack depth 10: 1% slower
>           - For an array of 200k integers allocated in a loop; stack depth 1024 vs stack depth 10: 3% slower
>
> So basically now moving the maximum stack depth to 1024 but we only copy over the stack depths actually used.
>
> For the next webrev, I will be adding a stack depth test to show that it works and probably put back the mutex locking so that we can see how difficult it is to keep
> thread safe.
>
> Let me know what you think!
> Jc
>
>
>
> On Mon, Sep 25, 2017 at 3:02 PM, JC Beyler wrote:
>
> Forgot to say that for my numbers:
>  - Not in the test are the actual numbers I got for the various array sizes, I ran the program 30 times and parsed the output; here are the averages and standard
> deviation:
>       1000:     1.28% average; 1.13% standard deviation
>       10000:    1.59% average; 1.25% standard deviation
>       100000:   1.26% average; 1.26% standard deviation
>
> The 1000/10000/100000 are the sizes of the arrays being allocated. These are allocated 100k times and the sampling rate is 111 times the size of the array.
>
> Thanks!
> Jc
>
>
> On Mon, Sep 25, 2017 at 3:01 PM, JC Beyler wrote:
>
> Hi all,
>
> After a bit of a break, I am back working on this :). As before, here are two webrevs:
>
> - Full change set: http://cr.openjdk.java.net/~rasbold/8171119/webrev.09/
> - Compared to version 8: http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/
>     (This version is compared to version 8 I last showed but ported to the new folder hierarchy)
>
> In this version I have:
>   - Handled Thomas' comments from his email of 07/03:
>        - Merged the logging to be standard
>        - Fixed up the code a bit where asked
>        - Added some notes about the code not being thread-safe yet
>    - Removed additional dead code from the version that modifies interpreter/c1/c2
>    - Fixed compiler issues so that it compiles with --disable-precompiled-headers
>
- Tested with ./configure --with-boot-jdk= --with-debug-level=slowdebug --disable-precompiled-headers
>
> Additionally, I added a test to check the sanity of the sampler: HeapMonitorStatCorrectnessTest
> (http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatCorrectnessTest.java.patch)
>    - This allocates a number of arrays and checks that we obtain the number of samples we want with an accepted error of 5%. I tested it 100 times and it
> passed every time, I can test more if wanted
>    - Not in the test are the actual numbers I got for the various array sizes, I ran the program 30 times and parsed the output; here are the averages and
> standard deviation:
>       1000:     1.28% average; 1.13% standard deviation
>       10000:    1.59% average; 1.25% standard deviation
>       100000:   1.26% average; 1.26% standard deviation
>
> What this means is that we were always at about 1~2% of the number of samples the test expected.
>
> Let me know what you think,
> Jc
>
> On Wed, Jul 5, 2017 at 9:31 PM, JC Beyler wrote:
>
> Hi all,
>
> I apologize, I have not yet handled your remarks but thought this new webrev would also be useful to see and comment on perhaps.
>
> Here is the latest webrev, it is generated slightly differently than the others since now I'm using webrev.ksh without the -N option:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.08/
>
> And the webrev.07 to webrev.08 diff is here:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.07_08/
>
> (Let me know if it works well)
>
> It's a small change between versions but it:
>   - provides a fix that makes the average sample rate correct (more on that below).
>   - fixes the code to actually have it play nicely with the fast tlab refill
>
- cleaned up a bit the JVMTI text and now use jvmtiFrameInfo
> - moved the capability to be onload solo
>
> With this webrev, I've done a small study of the random number generator we use here for the sampling rate. I took a small program and it can be simplified to:
>
> for (outer loop)
>   for (inner loop)
>     int[] tmp = new int[arraySize];
>
> - I've fixed the outer and inner loops to being 800 for this experiment, meaning we allocate 640000 times an array of a given array size.
>
> - Each program provides the average sample size used for the whole execution
>
> - Then, I ran each variation 30 times and then calculated the average of the average sample size used for various array sizes. I selected the array size to
> be one of the following: 1, 10, 100, 1000, 10000.
>
> - When compared to 512kb, the average sample size of 30 runs:
>   1: 4.62% of error
>   10: 3.09% of error
>   100: 0.36% of error
>   1000: 0.1% of error
>   10000: 0.03% of error
>
> What it shows is that, depending on the number of samples, the average does become better. This is because with an allocation of 1 element per array, it
> will take longer to hit one of the thresholds. This is seen by looking at the sample count statistic I put in. For the same number of iterations (800 *
> 800), the different array sizes provoke:
>   1: 62 samples
>   10: 125 samples
>   100: 788 samples
>   1000: 6166 samples
>   10000: 57721 samples
>
> And of course, the more samples you have, the more sample rates you pick, which means that your average gets closer using that math.
>
> Thanks,
> Jc
>
> On Thu, Jun 29, 2017 at 10:01 PM, JC Beyler wrote:
>
> Thanks Robbin,
>
> This seems to have worked. When I have the next webrev ready, we will find out but I'm fairly confident it will work!
>
> Thanks again!
> Jc
>
> On Wed, Jun 28, 2017 at 11:46 PM, Robbin Ehn wrote:
>
> Hi JC,
>
> On 06/29/2017 12:15 AM, JC Beyler wrote:
>
> B) Incremental changes
>
>
> I guess the most common work flow here is using mq :
> hg qnew fix_v1
> edit files
> hg qrefresh
> hg qnew fix_v2
> edit files
> hg qrefresh
>
> if you do hg log you will see 2 commits
>
> webrev.ksh -r -2 -o my_inc_v1_v2
> webrev.ksh -o my_full_v2
>
>
> In your .hgrc you might need:
> [extensions]
> mq =
>
> /Robbin
>
>
> Again another newbie question here...
>
> For showing the incremental changes, is there a link that explains how to do that? I apologize for my newbie questions all the time :)
>
> Right now, I do:
>
>   ksh ../webrev.ksh -m -N
>
> That generates a webrev.zip and I send it to Chuck Rasbold. He then uploads it to a new webrev.
>
> I tried committing my change and adding a small change. Then if I just do ksh ../webrev.ksh without any options, it seems to produce a similar
> page but now with only the changes I had (so the 06-07 comparison you were talking about) and a changeset that has it all. I imagine that is
> what you meant.
>
> Which means that my workflow would become:
>
> 1) Make changes
> 2) Make a webrev without any options to show just the differences with the tip
> 3) Amend my changes to my local commit so that I have it done with
> 4) Go to 1
>
> Does that seem correct to you?
>
> Note that when I do this, I only see the full change of a file in the full change set (Side note here: now the page says change set and not
> patch, which is maybe why Serguei was having issues?).
>
> Thanks!
> Jc
>
>
>
> On Wed, Jun 28, 2017 at 1:12 AM, Robbin Ehn wrote:
>
>     Hi,
>
>     On 06/28/2017 12:04 AM, JC Beyler wrote:
>
>         Dear Thomas et al,
>
>         Here is the newest webrev:
>         http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/
>
>
>     You have some more bits to do in there but generally this looks good and really nice with more tests.
>
I'll do a deep dive and re-test this when I get back from my long vacation with whatever patch version you have then.
>
>     Also I think it's time you provide incremental (v06->07 changes) as well as complete change-sets.
>
>     Thanks, Robbin
>
>
>
>
>         Thomas, I "think" I have answered all your remarks. The summary is:
>
>         - The statistic system is up and provides insight on what the heap sampler is doing
>              - I've noticed that, though the sampling rate is at the right mean, we are missing some samples, I have not yet tracked down why (details below)
>
>         - I've run a tiny benchmark that is the worst case: it is a very tight loop and allocates a small array
>              - In this case, I see no overhead when the system is off so that is a good start :)
>              - I see right now a high overhead in this case when sampling is on. This is not really too surprising but I'm going to see if this is consistent with our
>         internal implementation. The benchmark is really allocation stressful so I'm not too surprised but I want to do the due diligence.
>
>            - The statistic system is up and I have a new test
>              http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch
>
>               - I did a bit of a study about the random generator here, more details are below but basically it seems to work well
>
>            - I added a capability but since this is the first time doing this, I was not sure I did it right
>              - I did add a test though for it and the test seems to do what I expect (all methods are failing with the JVMTI_ERROR_MUST_POSSESS_CAPABILITY error).
>                  - http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch
>
>
- I still need to figure out what to do about the multi-agent vs single-agent issue
>
>            - As far as measurements, it seems I still need to look at:
>              - Why we do the 20 random calls first, are they necessary?
>              - Look at the mean of the sampling rate that the random generator does and also what is actually sampled
>              - What is the overhead in terms of memory/performance when on?
>
>         I have inlined my answers, I think I got them all in the new webrev, let me know your thoughts.
>
>         Thanks again!
>         Jc
>
>
>         On Fri, Jun 23, 2017 at 3:52 AM, Thomas Schatzl wrote:
>
>              Hi,
>
>              On Wed, 2017-06-21 at 13:45 -0700, JC Beyler wrote:
>              > Hi all,
>              >
>              > First off: Thanks again to Robbin and Thomas for their reviews :)
>              >
>              > Next, I've uploaded a new webrev:
>              > http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>              >
>              > Here is an update:
>              >
>              > - @Robbin, I forgot to say that yes I need to look at implementing
>              > this for the other architectures and testing it before it is all
>              > ready to go. Is it common to have it working on all possible
>              > combinations or is there a subset that I should be doing first and we
>              > can do the others later?
>              > - I've tested slowdebug, built and ran the JTreg tests I wrote with
>              > slowdebug and fixed a few more issues
>              > - I've refactored a bit of the code following Thomas' comments
>              >    - I think I've handled all the comments from Thomas (I put
>              > comments inline below for the specifics)
>
>              Thanks for handling all those.
>
>              > - Following Thomas' comments on statistics, I want to add some
>
> quality assurance tests and find that the easiest way would be to
>              > have a few counters of what is happening in the sampler and expose
>              > that to the user.
>              >    - I'll be adding that in the next version if no one sees any
>              > objections to that.
>              >    - This will allow me to add a sanity test in JTreg about number of
>              > samples and average of sampling rate
>              >
>              > @Thomas: I had a few questions that I inlined below but I will
>              > summarize the "bigger ones" here:
>              >    - You mentioned constants are not using the right conventions, I
>              > looked around and didn't see any convention except normal naming then
>              > for static constants. Is that right?
>
>              I looked through https://wiki.openjdk.java.net/display/HotSpot/StyleGuide
>              and the rule is to "follow an existing pattern and must have a
>              distinct appearance from other names". Which does not help a lot I
>              guess :/ The GC team started using upper camel case, e.g.
>              SomeOtherConstant, but very likely this is probably not applied
>              consistently throughout. So not adding another style
>              (like kMaxStackDepth with the "k" in front with some unknown meaning)
>              is fine with me.
>
>              (Chances are you will find that style somewhere used anyway too,
>              apologies if so :/)
>
>
>         Thanks for that link, now I know where to look. I used the upper camel case in my code as well then :) I should have gotten them all.
>
>
>              > PS: I've also inlined my answers to Thomas below:
>              >
>              > On Tue, Jun 13, 2017 at 8:03 AM, Thomas Schatzl wrote:
>              > > Hi all,
>              > >
>
> > On Mon, 2017-06-12 at 11:11 -0700, JC Beyler wrote:
>              > > > Dear all,
>              > > >
>              > > > I've continued working on this and have done the following
>              > > webrev:
>              > > > http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>              > > >
>              > > [...]
>              > > > Things I still need to do:
>              > > >    - Have to fix that TLAB case for the FastTLABRefill
>              > > >    - Have to start looking at the data to see that it is
>              > > consistent and does gather the right samples, right frequency, etc.
>              > > >    - Have to check the GC elements and what that produces
>              > > >    - Run a slowdebug run and ensure I fixed all those issues you
>              > > saw > Robbin
>              > > >
>              > > > Thanks for looking at the webrev and have a great week!
>              > >
>              > >   scratching a bit on the surface of this change, so apologies for
>              > > rather shallow comments:
>              > >
>              > > - macroAssembler_x86.cpp:5604: while this is compiler code, and I
>              > > am not sure this is final, please avoid littering the code with
>              > > TODO remarks :) They tend to be candidates for later wtf moments
>              > > only.
>              > >
>              > > Just file a CR for that.
>              > >
>              > Newcomer question: what is a CR and not sure I have the rights to do
>              > that yet ? :)
>
>              Apologies. CR is a change request, this suggests to file a bug in the
>              bug tracker. And you are right, you can't just create a new account in
>              the OpenJDK JIRA yourselves. :(
>
>
>         Ok good to know, I'll continue with my own todo list but I'll work hard on not letting it slip in the webrevs anymore :)
>
>
>
I was mostly referring to the "... but it is a TODO" part of that
>              comment in macroassembler_x86.cpp. Comments about the why of the code
>              are appreciated.
>
>              [Note that I now understand that this is to some degree still work in
>              progress. As long as the final changeset does not contain TODO's I am
>              fine (and it's not a hard objection, rather their use in "final" code
>              is typically limited in my experience)]
>
>              5603   // Currently, if this happens, just set back the actual end to
>              where it was.
>              5604   // We miss a chance to sample here.
>
>              Would be okay, if explaining "this" and the "why" of missing a chance
>              to sample here would be best.
>
>              Like maybe:
>
>              // If we needed to refill TLABs, just set the actual end point to
>              // the end of the TLAB again. We do not sample here although we could.
>
>         Done with your comment, it works well in my mind.
>
>              I am not sure whether "miss a chance to sample" meant "we could, but
>              consciously don't because it's not that useful" or "it would be
>              necessary but don't because it's too complicated to do.".
>
>              Looking at the original comment once more, I am also not sure if that
>              comment shouldn't be referring to the "end" variable (not actual_end)
>              because that's the variable that is responsible for taking the sampling
>              path? (Going from the member description of ThreadLocalAllocBuffer).
>
>
>         I've moved this code and it no longer shows up here but the rationale and answer was:
>
>         So.. Yes, end is the variable provoking the sampling. Actual end is the actual end of the TLAB.
>
>         What was happening here is that the code is resetting _end to point towards the end of the new TLAB.
Because we now have the end for sampling and _actual_end for the actual end, we need to update the actual_end as well.
>
>         Normally, were we to do the real work here, we would calculate the (end - start) offset, then do:
>
>         - Set the new end to : start + (old_end - old_start)
>         - Set the actual end like we do here now, because it is the actual end.
>
>         Why is this not done here now anymore?
>             - I was still debating which path to take:
>                - Do it in the fast refill code, it has its perks:
>                    - In a world where fast refills are happening all the time or a lot, we can augment there the code to do the sampling
>                - Remember what we had as an end before leaving the slowpath and check on return
>                    - This is what I'm doing now, it removes the need to go fix up all fast refill paths but if you remain in fast refill paths, you won't get sampling. I
>         have to think of the consequences of that, maybe a future change later on?
>                       - I have the statistics now so I'm going to study that
>                          -> By the way, though my statistics are showing I'm missing some samples, if I turn off FastTlabRefill, it is the same loss so for now, it seems
>         this does not occur in my simple test.
>
>
>
>              But maybe I am only confused and it's best to just leave the comment
>              away. :)
>
>              Thinking about it some more, doesn't this not-sampling in this case
>              mean that sampling does not work in any collector that does inline TLAB
>              allocation at the moment? (Or is inline TLAB alloc automatically
>              disabled with sampling somehow?)
>
>              That would indeed be a bigger TODO then :)
>
>
>         Agreed, this remark made me think that perhaps as a first step the new way of doing it is better but I did have to:
>
- Remove the const of the ThreadLocalAllocBuffer remaining and hard_end methods
>            - Move hard_end out of the header file to have a bit more logic there
>
>         Please let me know what you think of that and if you prefer it this way or changing the fast refills. (I prefer this way now because it is more incremental).
>
>
>              > > - calling HeapMonitoring::do_weak_oops() (which should probably be
>              > > called weak_oops_do() like other similar methods) only if string
>              > > deduplication is enabled (in g1CollectedHeap.cpp:4511) seems wrong.
>              >
>              > The call should be at least around 6 lines up outside the if.
>              >
>              > Preferentially in a method like process_weak_jni_handles(), including
>              > additional logging. (No new (G1) gc phase without minimal logging
>              > :)).
>              > Done but really not sure because:
>              >
>              > I put for logging:
>              >   log_develop_trace(gc, freelist)("G1ConcRegionFreeing [other] : heap
>              > monitoring");
>
>              I would think that "gc, ref" would be more appropriate log tags for
>              this similar to jni handles.
>              (I am also not sure what weak reference handling has to do with
>              G1ConcRegionFreeing, so I am a bit puzzled)
>
>
>         I was not sure what to put for the tags or really as the message. I cleaned it up a bit now to:
>              log_develop_trace(gc, ref)("HeapSampling [other] : heap monitoring processing");
>
>
>
>              > Since weak_jni_handles didn't have logging for me to be inspired
>              > from, I did that but unconvinced this is what should be done.
>
>              The JNI handle processing does have logging, but only in
>              ReferenceProcessor::process_discovered_references(). In
>              process_weak_jni_handles() only overall time is measured (in a G1
>
specific way, since only G1 supports disabling reference processing) :/
>
>              The code in ReferenceProcessor prints both time taken
>              referenceProcessor.cpp:254, as well as the count, but strangely only in
>              debug VMs.
>
>              I have no idea why this logging is that unimportant to only print that
>              in a debug VM. However there are reviews out for changing this area a
>              bit, so it might be useful to wait for that (JDK-8173335).
>
>
>         I cleaned it up a bit anyway and now it returns the count of objects that are in the system.
>
>
>              > > - the change doubles the size of
>              > > CollectedHeap::allocate_from_tlab_slow() above the "small and nice"
>              > > threshold. Maybe it could be refactored a bit.
>              > Done I think, it looks better to me :).
>
>              In ThreadLocalAllocBuffer::handle_sample() I think the
>              set_back_actual_end()/pick_next_sample() calls could be hoisted out of
>              the "if" :)
>
>
>         Done!
>
>
>              > > - referenceProcessor.cpp:261: the change should add logging about
>              > > the number of references encountered, maybe after the corresponding
>              > > "JNI weak reference count" log message.
>              > Just to double check, are you saying that you'd like to have the heap
>              > sampler to keep in store how many sampled objects were encountered in
>              > the HeapMonitoring::weak_oops_do?
>              >    - Would a return of the method with the number of handled
>              > references and logging that work?
>
>              Yes, it's fine if HeapMonitoring::weak_oops_do() only returned the
>              number of processed weak oops.
>
>
>         Done also (but I admit I have not tested the output yet) :)
>
>
>              >    - Additionally, would you prefer it in a separate block with its
>              > GCTraceTime?
>
>
Yes. Both kinds of information are interesting: while the time taken is
>              typically more important, the next question would be why, and the
>              number of references typically goes a long way there.
>
>              See above though, it is probably best to wait a bit.
>
>
>         Agreed that I "could" wait but, if it's ok, I'll just refactor/remove this when we get closer to something final. Either, JDK-8173335
>         has gone in and I will notice it now or it will soon and I can change it then.
>
>
>              > > - threadLocalAllocBuffer.cpp:331: one more "TODO"
>              > Removed it and added it to my personal todos to look at.
>              >
>              > > - threadLocalAllocBuffer.hpp: ThreadLocalAllocBuffer class
>              > > documentation should be updated about the sampling additions. I
>              > > would have no clue what the difference between "actual_end" and
>              > > "end" would be from the given information.
>              > If you are talking about the comments in this file, I made them more
>              > clear I hope in the new webrev. If it was somewhere else, let me know
>              > where to change.
>
>              Thanks, that's much better. Maybe a note in the comment of the class
>              that ThreadLocalAllocBuffer provides some sampling facility by modifying the
>              end() of the TLAB to cause "frequent" calls into the runtime call where
>              actual sampling takes place.
>
>
>         Done, I think it's better now. Added something about the slow_path_end as well.
>
>
>              > > - in heapMonitoring.hpp: there are some random comments about some
>              > > code that has been grabbed from "util/math/fastmath.[h|cc]". I
>              > > can't tell whether this is code that can be used but I assume that
>              > > Noam Shazeer is okay with that (i.e. that's all Google code).
>
> Jeremy and I double checked and we can release that as I thought. I
>              > removed the comment from that piece of code entirely.
>
>              Thanks.
>
>              > > - heapMonitoring.hpp/cpp static constant naming does not correspond
>              > > to Hotspot's. Additionally, in Hotspot static methods are cased
>              > > like other methods.
>              > I think I fixed the methods to be cased the same way as all other
>              > methods. For static constants, I was not sure. I fixed a few other
>              > variables but I could not seem to really see a consistent trend for
>              > constants. I made them as variables but I'm not sure now.
>
>              Sorry again, style is a kind of mess. The goal of my suggestions here
>              is only to prevent yet another style creeping in.
>
>              > > - in heapMonitoring.cpp there are a few cryptic comments at the top
>              > > that seem to refer to internal stuff that should probably be
>              > > removed.
>              > Sorry about that! My personal todos not cleared out.
>
>              I am happy about comments, but I simply did not understand any of that
>              and I do not know about other readers as well.
>
>              If you think you will remember removing/updating them until the review
>              proper (I misunderstood the review situation a little it seems).
>
>              > > I did not think through the impact of the TLAB changes on collector
>              > > behavior yet (if there are). Also I did not check for problems with
>              > > concurrent mark and SATB/G1 (if there are).
>              > I would love to know your thoughts on this, I think this is fine. I
>
>              I think so too now. No objects are made live out of thin air :)
>
>              > see issues with multiple threads right now hitting the stack storage
>              > instance.
Previous webrevs had a mutex lock here but we took it out
>              > for simplicity (and only for now).
>
>              :) When looking at this after some thinking I now assume for this
>              review that this code is not MT safe at all. There seems to be more
>              synchronization missing than just the one for the StackTraceStorage. So
>              no comments about this here.
>
>
>         I double checked a bit (quickly I admit) but it seems that synchronization in StackTraceStorage is really all you need (all methods lead to a StackTraceStorage one
>         and can be multithreaded outside of that).
>         There is a question about the initialization where the method HeapMonitoring::initialize_profiling is not thread safe.
>         It would work (famous last words) and not crash if there was a race but we could add a synchronization point there as well (and therefore on the stop as well).
>
>         But anyway I will really check and do this once we add back synchronization.
>
>
>              Also, this would require some kind of specification of what is allowed
>              to be called when and where.
>
>
>         Would we specify this with the methods in the jvmti.xml file? We could start by specifying in each that they are not thread safe but I saw no mention of that for
>         other methods.
>
>
>              One potentially relevant observation about locking here: depending on
>              sampling frequency, StackTraceStorage::add_trace() may be rather
>              frequently called. I assume that you are going to do measurements :)
>
>
>         Though we don't have the TLAB implementation in our code, the compiler generated sampler uses 2% of overhead with a 512k sampling rate. I can do real measurements
>         when the code settles and we can see how costly this is as a TLAB implementation.
>
However, my theory is that if the rate is 512k, the memory/performance overhead should be minimal since it is what we saw with our code/workloads (though not called
>         the same way, we call it essentially at the same rate).
>         If you have a benchmark you'd like me to test, let me know!
>
>         Right now, with my really small test, this does use a bit of overhead even for a 512k sample size. I don't know yet why, I'm going to see what is going on.
>
>         Finally, I think it is not reasonable to suppose the overhead to be negligible if the sampling rate used is too low. The user should know that the lower the rate,
>         the higher the overhead (documentation TODO?).
>
>
>              I am not sure what the expected usage of the API is, but
>              StackTraceStorage::add_trace() seems to be able to grow without bounds.
>              Only a GC truncates them to the live ones. That in itself seems to be
>              problematic (GCs can be *wide* apart), and of course some of the API
>              methods add to that because they duplicate that unbounded array. Do you
>              have any concerns/measurements about this?
>
>
>         So, the theory is that yes add_trace can grow without bounds but it grows at a sample per 512k of allocated space. The stacks it gathers are currently
>         maxed at 64 (I'd like to expand that to an option to the user though at some point). So I have no concerns because:
>
>         - If really this is taking a lot of space, that means the job is keeping a lot of objects in memory as well, therefore the entire heap is getting huge
>         - If this is the case, you will be triggering a GC at some point anyway.
>
>         (I'm putting under the rug the issue of "What if we set the rate to 1 for example" because as you lower the sampling rate, we cannot guarantee low overhead; the
>
>         idea behind this feature is to have a means of having meaningful
>         allocated samples at a low overhead)
>
>         I have no measurements really right now but since I now have some
>         statistics I can poll, I will look a bit more at this question.
>
>         I have the same last sentence as above: the user should expect this
>         to happen if the sampling rate is too small. That probably can be
>         reflected in the StartHeapSampling as a note: careful, this might
>         impact your performance.
>
>
>              Also, these stack traces might hold on to huge arrays. Any
>              consideration of that? Particularly it might be the cause for OOMEs in
>              tight memory situations.
>
>
>         There is a stack size maximum that is set to 64 so it should not
>         hold huge arrays. I don't think this is an issue but I can double
>         check with a test or two.
>
>
>              - please consider adding a safepoint check in
>              HeapMonitoring::weak_oops_do to prevent accidental misuse.
>
>              - in struct StackTraceStorage, the public fields may also need
>              underscores. At least some files in the runtime directory have structs
>              with underscored public members (and some don't). The runtime team
>              should probably comment on that.
>
>
>         Agreed, I did not know. I looked around and a lot of structs did not
>         have them, it seemed, so I left it as is. I will happily change it
>         if someone prefers (I was not sure if you really preferred or not,
>         your sentence seemed to be more a note of "this might need to change
>         but I don't know if the runtime team enforces that", let me know if
>         I read that wrongly).
>
>
>              - In StackTraceStorage::weak_oops_do(), when examining the
>              StackTraceData, maybe it is useful to consider having a non-NULL
>              reference outside of the heap's reserved space an error. There should
>              be no oop outside of the heap's reserved space ever.
>
>              Unless you allow storing random values in StackTraceData::obj, which I
>              would not encourage.
>
>
>         I suppose you are talking about this part:
>         if ((value != NULL && Universe::heap()->is_in_reserved(value)) &&
>                     (is_alive == NULL || is_alive->do_object_b(value))) {
>
>         What you are saying is that I could have something like:
>         if (value != my_non_null_reference &&
>                     (is_alive == NULL || is_alive->do_object_b(value))) {
>
>         Is that what you meant? Is there really a reason to do so? When I
>         look at the code, is_in_reserved seems like an O(1) method call. I'm
>         not even sure we can have a NULL value, to be honest. I might have
>         to study that to see if this was not a paranoid test to begin with.
>
>         The is_alive code has now morphed due to the comment below.
>
>
>
>              - HeapMonitoring::weak_oops_do() does not seem to use the
>              passed AbstractRefProcTaskExecutor.
>
>
>         It did use it:
>           size_t HeapMonitoring::weak_oops_do(
>               AbstractRefProcTaskExecutor *task_executor,
>               BoolObjectClosure* is_alive,
>               OopClosure *f,
>               VoidClosure *complete_gc) {
>             assert(SafepointSynchronize::is_at_safepoint(), "must be at safepoint");
>
>             if (task_executor != NULL) {
>               task_executor->set_single_threaded_mode();
>             }
>             return StackTraceStorage::storage()->weak_oops_do(is_alive, f, complete_gc);
>           }
>
>         But due to the comment below, I refactored this, so this is no
>         longer here. Now I have an always true closure that is passed.
>
>
>              - I do not understand allowing to call this method with a NULL
>              complete_gc closure. This would mean that objects referenced from the
>              object that is referenced by the StackTraceData are not pulled, meaning
>              they would get stale.
>
>              - same with is_alive parameter value of NULL
>
>
>         So these questions made me look a bit closer at this code. This
>         code I think was written this way to have a very small impact on the
>         file but you are right, there is no reason for this here. I've
>         simplified the code by making in referenceProcessor.cpp a
>         process_HeapSampling method that handles everything there.
>
>         The code allowed NULLs because it depended on where you were coming
>         from and how the code was being called.
>
>         - I added a static always_true variable and pass that now to be more
>           consistent with the rest of the code.
>         - I moved the complete_gc into process_phaseHeapSampling now (new
>           method) and handle the task_executor and the complete_gc there
>              - Newbie question: in our code we did a set_single_threaded_mode
>                but I see that process_phaseJNI does it right before its call,
>                do I need to do it for the process_phaseHeapSample?
>         That API is much cleaner (in my mind) and is consistent with what is
>         done around it (again in my mind).
>
>
>              - heapMonitoring.cpp:590: I do not completely understand the purpose of
>              this code: in the end this results in a fixed value directly dependent
>              on the Thread address anyway?
>              IOW, what is special about exactly 20 rounds?
>
>
>         So we really want a fast random number generator that has a specific
>         mean (512k is the default we use). The code uses the thread address
>         as the start number of the sequence (why not, it is random enough is
>         the rationale). Then instead of just starting there, we prime the
>         sequence and really only start at the 21st number; it is
>         arbitrary and I have not done a study to see if we could do more or
>         less of that.
>
>         As I have the statistics of the system up and running, I'll run some
>         experiments to see if this is needed, is 20 good, or not.
>
>
>              - also I would consider stripping a few bits of the threads' address as
>              initialization value for your rng. The last three bits (and probably
>              more, check whether the Thread object is allocated on special
>              boundaries) are always zero for them.
>              Not sure if the given "random" value is random enough before/after,
>              this method, so just skip that comment if you think this is not
>              required.
>
>
>         I don't know is the honest answer. I think what is important is that
>         we tend towards a mean and it is random "enough" to not fall into
>         the pitfall of only sampling a subset of objects due to their
>         allocation order. I added that as a test to do, to see if it changes
>         the mean in any way for the 512k default value and/or if the first
>         1000 elements look better.
>
>
>              Some more random nits I did not find a place to put anywhere:
>
>              - ThreadLocalAllocBuffer::_extra_space does not seem to be used
>              anywhere?
>
>
>         Good catch :).
>
>
>              - Maybe indent the declaration of ThreadLocalAllocBuffer::_bytes_until_sample
>              to align below the other members of that group.
>
>
>         Done, moved it up a bit to have non-static members together and
>         static separate.
>
>              Thanks,
>                 Thomas
>
>
>         Thanks for your review!
>         Jc
>
>

From robbin.ehn at oracle.com Mon Oct 16 15:59:45 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 16 Oct 2017 17:59:45 +0200
Subject: Status of JEP159?
In-Reply-To: <9057365e-1f47-802c-1004-62b9267ba4e8@oracle.com>
References: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com>
 <24b9381a-96fd-1a21-8e0d-069e0338ae5f@oracle.com>
 <4b3a9eb6-9ad0-37e7-596e-11a5130474a6@oracle.com>
 <50817987-269d-21d2-c61e-ab239e208a6b@oracle.com>
 <9057365e-1f47-802c-1004-62b9267ba4e8@oracle.com>
Message-ID:

On 10/16/2017 05:44 PM, Alan Bateman wrote:
> On 16/10/2017 14:55, Robbin Ehn wrote:
>>
>> There is actually an issue: we start all transformations with the 'on disk' version.
>> If the agent that added a public method e.g. exits (removeTransformer), we can
>> never re-transform the class; instead we get:
>> "error method delete"
>> It has been suggested that we should use the 'first published' class version as a
>> baseline (the version after CFLH), but that would break current agents (I assume).
>>
>> :
>> Is it a bug or work as intended? (or a bug we can't fix)
> If all agents (or JVM TI environments) are retransformation capable then
> retransformClasses should send the initial class file bytes (or the "on disk"
> version as you termed it) to the CFLH of the first agent. If a retransformation
> capable agent adds a method in the initial load then it should add it again when
> called to retransform the class.
>
> On the other hand, if there are retransformation incapable agents in the picture
> then the class file bytes sent to the CFLH of the first retransformation capable
> agent will be the class bytes from the output of the retransformation incapable
> agents. So if a retransformation incapable agent adds a method in the initial load
> then that method will exist in the class bytes that the retransformation capable
> agents see when they retransform.

I see, in my case it's several CFLH agents with retransformation capability. The
one that added the public method is removed and no longer called, leaving the
other agents without the ability to retransform anymore, since they get the class
file bytes without the public method.
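
The situation described above can be sketched with a small Java model. This is a toy under stated assumptions, not the JVM TI or java.lang.instrument API: a "class file" is reduced to a set of method names, an agent is a function over that set, and the names RetransformModel, addedByAgent and demo are hypothetical. It only illustrates that every retransformation restarts from the on-disk bytes, so a method added at load time disappears once the agent that added it is gone.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

// Toy model only: a "class file" is just a set of method names, and an agent
// is a function over that set. None of this is the real JVM TI API.
final class RetransformModel {
    private final Set<String> onDisk;
    private Set<String> loaded;
    private final List<UnaryOperator<Set<String>>> agents = new ArrayList<>();

    RetransformModel(Set<String> onDisk) {
        this.onDisk = new LinkedHashSet<>(onDisk);
        this.loaded = new LinkedHashSet<>(onDisk);
    }

    void addAgent(UnaryOperator<Set<String>> agent) { agents.add(agent); }
    void removeAgent(UnaryOperator<Set<String>> agent) { agents.remove(agent); }

    // Initial load: agents may add methods freely.
    void load() { loaded = runChain(); }

    // Retransformation: restart from the on-disk bytes and refuse the result
    // if a method of the currently loaded class would disappear.
    boolean retransform() {
        Set<String> candidate = runChain();
        if (!candidate.containsAll(loaded)) {
            return false; // stands in for the "method delete" error
        }
        loaded = candidate;
        return true;
    }

    private Set<String> runChain() {
        Set<String> bytes = new LinkedHashSet<>(onDisk);
        for (UnaryOperator<Set<String>> agent : agents) {
            bytes = agent.apply(bytes);
        }
        return bytes;
    }

    // Returns { result while the adding agent is registered, result after its removal }.
    static boolean[] demo() {
        UnaryOperator<Set<String>> adder = methods -> {
            Set<String> copy = new LinkedHashSet<>(methods);
            copy.add("addedByAgent"); // hypothetical method name
            return copy;
        };
        UnaryOperator<Set<String>> passThrough = methods -> new LinkedHashSet<>(methods);

        RetransformModel model = new RetransformModel(Collections.singleton("run"));
        model.addAgent(adder);
        model.addAgent(passThrough);
        model.load();
        boolean before = model.retransform();
        model.removeAgent(adder);
        boolean after = model.retransform();
        return new boolean[] { before, after };
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("retransform with adding agent present: " + result[0]);
        System.out.println("retransform after adding agent removed: " + result[1]);
    }
}
```

In this model the second retransformation is rejected because the remaining agent receives the on-disk method set, which no longer contains the method that the loaded class already has.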
/Robbin

>
> -Alan

From jcbeyler at google.com Mon Oct 16 16:34:15 2017
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 16 Oct 2017 09:34:15 -0700
Subject: Low-Overhead Heap Profiling
In-Reply-To: <5ec70351-910a-96bb-eb03-43ca88bd6259@oracle.com>
References: <1497366226.2829.109.camel@oracle.com>
 <1498215147.2741.34.camel@oracle.com>
 <044f8c75-72f3-79fd-af47-7ee875c071fd@oracle.com>
 <23f4e6f5-c94e-01f7-ef1d-5e328d4823c8@oracle.com>
 <5ec70351-910a-96bb-eb03-43ca88bd6259@oracle.com>
Message-ID:

Hi Robbin,

That is because version 11 to 12 was only a test change. I was going to
write about it and say here are the webrev links:

Incremental:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.11_12/

Full webrev:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.12/

This change focused only on refactoring the tests to be more manageable,
readable, maintainable. As all tests are looking at allocations, I moved
common code to a java class:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.11_12/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.patch

And then most tests call into that class to turn on/off the sampling,
allocate, etc. This has removed almost 500 lines of test code so I'm happy
about that.

Thanks for your changes, a bit of relics of previous versions :). I've
already integrated them into my code and will make a new webrev end of this
week with a bit of refactor of the code handling the tlab slow path. I find
it could use a bit of refactoring to make it easier to follow so I'm going
to take a stab at it this week.

Any other issues/comments?

Thanks!
Jc

On Mon, Oct 16, 2017 at 8:46 AM, Robbin Ehn wrote:

> Hi JC,
>
> I saw a webrev.12 in the directory, with only test changes(11->12), so I
> took that version.
> I had a look and tested the tests, worked fine!
>
> First glance at the code (looking at full v12) some minor things below,
> mostly unused stuff.
>
> Thanks, Robbin
>
> diff -r 9047e0d726d6 src/hotspot/share/runtime/heapMonitoring.cpp
> --- a/src/hotspot/share/runtime/heapMonitoring.cpp Mon Oct 16 16:54:06 2017 +0200
> +++ b/src/hotspot/share/runtime/heapMonitoring.cpp Mon Oct 16 17:42:42 2017 +0200
> @@ -211,2 +211,3 @@
>    void initialize(int max_storage) {
> +    // validate max_storage to sane value ? What would 0 mean ?
>      MutexLocker mu(HeapMonitor_lock);
> @@ -227,8 +228,4 @@
>    bool initialized() { return _initialized; }
> -  volatile bool *initialized_address() { return &_initialized; }
>
>   private:
> -  // Protects the traces currently sampled (below).
> -  volatile intptr_t _stack_storage_lock[1];
> -
>    // The traces currently sampled.
> @@ -313,3 +310,2 @@
>        _initialized(false) {
> -    _stack_storage_lock[0] = 0;
>  }
> @@ -532,13 +528,2 @@
>
> -// Delegate the initialization question to the underlying storage system.
> -bool HeapMonitoring::initialized() {
> -  return StackTraceStorage::storage()->initialized();
> -}
> -
> -// Delegate the initialization question to the underlying storage system.
> -bool *HeapMonitoring::initialized_address() {
> -  return
> -      const_cast(StackTraceStorage::storage()->initialized_address());
> -}
> -
>  void HeapMonitoring::get_live_traces(jvmtiStackTraces *traces) {
> diff -r 9047e0d726d6 src/hotspot/share/runtime/heapMonitoring.hpp
> --- a/src/hotspot/share/runtime/heapMonitoring.hpp Mon Oct 16 16:54:06 2017 +0200
> +++ b/src/hotspot/share/runtime/heapMonitoring.hpp Mon Oct 16 17:42:42 2017 +0200
> @@ -35,3 +35,2 @@
>    static uint64_t _rnd;
> -  static bool _initialized;
>    static jint _monitoring_rate;
> @@ -92,7 +91,2 @@
>
> -  // Is the profiler initialized and where is the address to the initialized
> -  // boolean.
> -  static bool initialized();
> -  static bool *initialized_address();
> -
>    // Called when o is to be sampled from a given thread and a given size.
>
>
>
> On 10/10/2017 12:57 AM, JC Beyler wrote:
>
>> Dear all,
>>
>> Thread-safety is back!!
>> Here is the update webrev:
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/
>>
>> Full webrev is here:
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.11/
>>
>> In order to really test this, I needed to add this so thought now was a
>> good time. It required a few changes here for the creation to ensure
>> correctness and safety. Now we keep the static pointer but clear the data
>> internally so on re-initialize, it will be a bit more costly than before. I
>> don't think this is a huge use-case so I did not think it was a problem. I
>> used the internal MutexLocker, I think I used it well, let me know.
>>
>> I also added three tests:
>>
>> 1) Stack depth test:
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStackDepthTest.java.patch
>>
>> This test shows that the maximum stack depth system is working.
>>
>> 2) Thread safety:
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadTest.java.patch
>>
>> The test creates 24 threads and they all allocate at the same time. The
>> test then checks it does find samples from all the threads.
>>
>> 3) Thread on/off safety
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadOnOffTest.java.patch
>>
>> The test creates 24 threads that all allocate a bunch of memory. Then
>> another thread turns the sampling on/off.
>>
>> Btw, both tests 2 & 3 failed without the locks.
>>
>> As I worked on this, I saw a lot of places where the tests are doing very
>> similar things, I'm going to clean up the code a bit and make a
>> HeapAllocator class that all tests can call directly. This will greatly
>> simplify the code.
>>
>> Thanks for any comments/criticisms!
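
The shape of the thread-safety test described above (24 threads allocating at the same time, then a check that every thread was observed) might look roughly like the following Java sketch. It is not the test from the webrev: the class name ThreadAllocationSketch is hypothetical, and the concurrent set `seen` merely stands in for the samples that the real test reads back through the JVMTI heap sampler.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Sketch of the allocation harness only; no JVMTI sampling is involved here.
final class ThreadAllocationSketch {
    // Publishing each array to a volatile field keeps the allocation observable.
    static volatile int[] sink;

    // Starts threadCount threads that all begin allocating at the same moment
    // and returns how many distinct threads reported back.
    static int run(int threadCount, int allocationsPerThread) {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // wait so that all threads allocate concurrently
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int j = 0; j < allocationsPerThread; j++) {
                    sink = new int[1000];
                }
                // Stand-in for "a sample from this thread was found".
                seen.add(Thread.currentThread().getName());
            }, "allocator-" + i);
            workers[i].start();
        }

        start.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("threads observed: " + run(24, 10_000));
    }
}
```

The latch matters for the kind of race the thread describes: without it the first threads may finish before the last ones start, and the allocations would not actually overlap.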
>> Jc
>>
>>
>> On Mon, Oct 2, 2017 at 8:52 PM, JC Beyler <jcbeyler at google.com> wrote:
>>
>> Dear all,
>>
>> Small update to the webrev:
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.09_10/
>>
>> Full webrev is here:
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/
>>
>> I updated a bit of the naming, removed a TODO comment, and I added a
>> test for testing the sampling rate. I also updated the maximum stack depth
>> to 1024, there is no reason to keep it so small. I did a micro benchmark
>> that tests the overhead and it seems relatively the same.
>>
>> I compared allocations from a stack depth of 10 and allocations from
>> a stack depth of 1024 (allocations are from the same helper method in
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java):
>> - For an array of 1 integer allocated in a loop; stack
>>   depth 1024 vs stack depth 10: 1% slower
>> - For an array of 200k integers allocated in a loop; stack
>>   depth 1024 vs stack depth 10: 3% slower
>>
>> So basically now moving the maximum stack depth to 1024 but we only
>> copy over the stack depths actually used.
>>
>> For the next webrev, I will be adding a stack depth test to show that
>> it works and probably put back the mutex locking so that we can see how
>> difficult it is to keep thread safe.
>>
>> Let me know what you think!
>> Jc
>>
>>
>>
>> On Mon, Sep 25, 2017 at 3:02 PM, JC Beyler wrote:
>>
>> Forgot to say that for my numbers:
>> - Not in the test are the actual numbers I got for the various
>>   array sizes, I ran the program 30 times and parsed the output; here are
>>   the averages and standard deviation:
>>     1000: 1.28% average; 1.13% standard deviation
>>     10000: 1.59% average; 1.25% standard deviation
>>     100000: 1.26% average; 1.26% standard deviation
>>
>> The 1000/10000/100000 are the sizes of the arrays being
>> allocated. These are allocated 100k times and the sampling rate is 111
>> times the size of the array.
>>
>> Thanks!
>> Jc
>>
>>
>> On Mon, Sep 25, 2017 at 3:01 PM, JC Beyler wrote:
>>
>> Hi all,
>>
>> After a bit of a break, I am back working on this :). As
>> before, here are two webrevs:
>>
>> - Full change set: http://cr.openjdk.java.net/~rasbold/8171119/webrev.09/
>> - Compared to version 8: http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/
>>   (This version is compared to version 8 I last showed but
>>   ported to the new folder hierarchy)
>>
>> In this version I have:
>> - Handled Thomas' comments from his email of 07/03:
>>   - Merged the logging to be standard
>>   - Fixed up the code a bit where asked
>>   - Added some notes about the code not being thread-safe yet
>> - Removed additional dead code from the version that
>>   modifies interpreter/c1/c2
>> - Fixed compiler issues so that it compiles with
>>   --disable-precompiled-headers
>>   - Tested with ./configure --with-boot-jdk=
>>     --with-debug-level=slowdebug --disable-precompiled-headers
>>
>> Additionally, I added a test to check the sanity of the
>> sampler: HeapMonitorStatCorrectnessTest
>> (http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatCorrectnessTest.java.patch
>> )
>> - This allocates a number of arrays and checks that we
>>   obtain the number of samples we want with an accepted error of 5%. I
>>   tested it 100 times and it passed every time, I can test more if wanted
>> - Not in the test are the actual numbers I got for the
>>   various array sizes, I ran the program 30 times and parsed the output;
>>   here are the averages and standard deviation:
>>     1000: 1.28% average; 1.13% standard deviation
>>     10000: 1.59% average; 1.25% standard deviation
>>     100000: 1.26% average; 1.26% standard deviation
>>
>> What this means is that we were always within about 1~2% of the
>> number of samples the test expected.
>>
>> Let me know what you think,
>> Jc
>>
>> On Wed, Jul 5, 2017 at 9:31 PM, JC Beyler <jcbeyler at google.com> wrote:
>>
>> Hi all,
>>
>> I apologize, I have not yet handled your remarks but
>> thought this new webrev would also be useful to see and comment on perhaps.
>>
>> Here is the latest webrev, it is generated slightly
>> differently than the others since now I'm using webrev.ksh without the -N
>> option:
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.08/
>>
>> And the webrev.07 to webrev.08 diff is here:
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.07_08/
>>
>> (Let me know if it works well)
>>
>> It's a small change between versions but it:
>> - provides a fix that makes the average sample rate
>>   correct (more on that below).
>> - fixes the code to actually have it play nicely with
>>   the fast tlab refill
>> - cleaned up a bit the JVMTI text and now use jvmtiFrameInfo
>> - moved the capability to be onload solo
>>
>> With this webrev, I've done a small study of the random
>> number generator we use here for the sampling rate.
>> I took a small program and it can be simplified to:
>>
>> for (outer loop)
>>   for (inner loop)
>>     int[] tmp = new int[arraySize];
>>
>> - I've fixed the outer and inner loops to being 800 for
>>   this experiment, meaning we allocate 640000 times an array of a given
>>   array size.
>>
>> - Each program provides the average sample size used for
>>   the whole execution
>>
>> - Then, I ran each variation 30 times and then calculated
>>   the average of the average sample size used for various array sizes. I
>>   selected the array size to be one of the following: 1, 10, 100, 1000,
>>   10000.
>>
>> - When compared to 512kb, the average sample size of 30 runs:
>>     1: 4.62% of error
>>     10: 3.09% of error
>>     100: 0.36% of error
>>     1000: 0.1% of error
>>     10000: 0.03% of error
>>
>> What it shows is that, depending on the number of
>> samples, the average does become better. This is because with an allocation
>> of 1 element per array, it will take longer to hit one of the thresholds.
>> This is seen by looking at the sample count statistic I put in. For the
>> same number of iterations (800 * 800), the different array sizes provoke:
>>     1: 62 samples
>>     10: 125 samples
>>     100: 788 samples
>>     1000: 6166 samples
>>     10000: 57721 samples
>>
>> And of course, the more samples you have, the more sample
>> rates you pick, which means that your average gets closer using that math.
>>
>> Thanks,
>> Jc
>>
>> On Thu, Jun 29, 2017 at 10:01 PM, JC Beyler <jcbeyler at google.com> wrote:
>>
>> Thanks Robbin,
>>
>> This seems to have worked. When I have the next
>> webrev ready, we will find out but I'm fairly confident it will work!
>>
>> Thanks again!
>> Jc
>>
>> On Wed, Jun 28, 2017 at 11:46 PM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>
>> Hi JC,
>>
>> On 06/29/2017 12:15 AM, JC Beyler wrote:
>>
>> B) Incremental changes
>>
>>
>> I guess the most common work flow here is using mq:
>> hg qnew fix_v1
>> edit files
>> hg qrefresh
>> hg qnew fix_v2
>> edit files
>> hg qrefresh
>>
>> if you do hg log you will see 2 commits
>>
>> webrev.ksh -r -2 -o my_inc_v1_v2
>> webrev.ksh -o my_full_v2
>>
>>
>> In your .hgrc you might need:
>> [extensions]
>> mq =
>>
>> /Robbin
>>
>>
>> Again another newbie question here...
>>
>> For showing the incremental changes, is there
>> a link that explains how to do that? I apologize for my newbie questions
>> all the time :)
>>
>> Right now, I do:
>>
>> ksh ../webrev.ksh -m -N
>>
>> That generates a webrev.zip and I send it to
>> Chuck Rasbold. He then uploads it to a new webrev.
>>
>> I tried committing my change and adding a
>> small change. Then if I just do ksh ../webrev.ksh without any options, it
>> seems to produce a similar page but now with only the changes I had (so
>> the 06-07 comparison you were talking about) and a changeset that has it
>> all. I imagine that is what you meant.
>>
>> Which means that my workflow would become:
>>
>> 1) Make changes
>> 2) Make a webrev without any options to show
>>    just the differences with the tip
>> 3) Amend my changes to my local commit so
>>    that I have it done with
>> 4) Go to 1
>>
>> Does that seem correct to you?
>>
>> Note that when I do this, I only see the full
>> change of a file in the full change set (Side note here: now the page says
>> change set and not patch, which is maybe why Serguei was having
>> issues?).
>>
>> Thanks!
>> Jc
>>
>>
>>
>> On Wed, Jun 28, 2017 at 1:12 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>
>> Hi,
>>
>> On 06/28/2017 12:04 AM, JC Beyler wrote:
>>
>> Dear Thomas et al,
>>
>> Here is the newest webrev:
>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/
>>
>>
>>
>> You have some more bits to in there but
>> generally this looks good and really nice with more tests.
>> I'll do a deep dive and re-test this
>> when I get back from my long vacation with whatever patch version you have
>> then.
>>
>> Also I think it's time you provide
>> incremental (v06->07 changes) as well as complete change-sets.
>>
>> Thanks, Robbin
>>
>>
>>
>>
>> Thomas, I "think" I have answered
>> all your remarks. The summary is:
>>
>> - The statistic system is up and
>>   provides insight on what the heap sampler is doing
>>   - I've noticed that, though the sampling rate is at the right mean,
>>     we are missing some samples, I have not yet tracked down why
>>     (details below)
>>
>> - I've run a tiny benchmark that is
>>   the worst case: it is a very tight loop and allocates a small array
>>   - In this case, I see no
>>     overhead when the system is off so that is a good start :)
>>   - I see right now a high
>>     overhead in this case when sampling is on. This is not really too
>>     surprising but I'm going to see if this is consistent with our
>>     internal implementation. The benchmark is really allocation stressful
>>     so I'm not too surprised but I want to do the due diligence.
>>
>> - The statistic system is up
>>   and I have a new test
>>   http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch
>>   - I did a bit of a study about
>>     the random generator here, more details are below but basically it
>>     seems to work well
>>
>> - I added a capability but since
>>   this is the first time doing this, I was not sure I did it right
>>   - I did add a test though for
>>     it and the test seems to do what I expect (all methods are failing
>>     with the JVMTI_ERROR_MUST_POSSESS_CAPABILITY error).
>>   - http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch
>>
>> - I still need to figure out what
>>   to do about the multi-agent vs single-agent issue
>>
>> - As far as measurements, it
>>   seems I still need to look at:
>>   - Why we do the 20 random calls
>>     first, are they necessary?
>>   - Look at the mean of the
>>     sampling rate that the random generator does and also what is
>>     actually sampled
>>   - What is the overhead in terms
>>     of memory/performance when on?
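
The two measurement questions about the random generator (the 20 priming calls and the mean of the sampling interval) can be explored outside the VM with a sketch like the one below. It rests on assumptions that the thread does not confirm: the interval is drawn from an exponential distribution with a 512 KiB mean, java.util.Random stands in for the sampler's own generator, and the seed is an arbitrary number rather than a Thread address. The class name SamplingIntervalSketch is hypothetical.

```java
import java.util.Random;

// Sketch under assumptions: exponential intervals, java.util.Random as the
// generator. This is not the HotSpot sampler code.
final class SamplingIntervalSketch {
    static final double MEAN_BYTES = 512 * 1024; // default mean from the thread
    static final int PRIMING_ROUNDS = 20;        // the "20 random calls first"

    private final Random rng;

    SamplingIntervalSketch(long seed) {
        rng = new Random(seed);
        // Discard the first values so sampling starts at the 21st number.
        for (int i = 0; i < PRIMING_ROUNDS; i++) {
            rng.nextDouble();
        }
    }

    // Number of bytes to allocate before the next sample is taken.
    long nextInterval() {
        double u = 1.0 - rng.nextDouble(); // in (0, 1], so log(u) is finite
        return (long) (-Math.log(u) * MEAN_BYTES) + 1;
    }

    // Average of the first `draws` intervals for a given seed.
    static double averageInterval(long seed, int draws) {
        SamplingIntervalSketch sampler = new SamplingIntervalSketch(seed);
        double sum = 0;
        for (int i = 0; i < draws; i++) {
            sum += sampler.nextInterval();
        }
        return sum / draws;
    }

    public static void main(String[] args) {
        for (int draws : new int[] { 100, 10_000, 1_000_000 }) {
            double average = averageInterval(42L, draws);
            double errorPercent = 100.0 * Math.abs(average - MEAN_BYTES) / MEAN_BYTES;
            System.out.printf("%d draws: relative error %.2f%%%n", draws, errorPercent);
        }
    }
}
```

Changing PRIMING_ROUNDS or the seed and comparing the averages is one way to check whether the priming has any measurable effect on the mean.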
>>
>> I have inlined my answers, I think I
>> got them all in the new webrev, let me know your thoughts.
>>
>> Thanks again!
>> Jc
>>
>>
>> On Fri, Jun 23, 2017 at 3:52 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>
>> Hi,
>>
>> On Wed, 2017-06-21 at 13:45 -0700, JC Beyler wrote:
>> > Hi all,
>> >
>> > First off: Thanks again to Robbin and Thomas for their reviews :)
>> >
>> > Next, I've uploaded a new webrev:
>> > http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>> >
>> > Here is an update:
>> >
>> > - @Robbin, I forgot to say that yes I need to look at implementing
>> > this for the other architectures and testing it before it is all
>> > ready to go. Is it common to have it working on all possible
>> > combinations or is there a subset that I should be doing first and we
>> > can do the others later?
>> > - I've tested slowdebug, built and ran the JTreg tests I wrote with
>> > slowdebug and fixed a few more issues
>> > - I've refactored a bit of the code following Thomas' comments
>> >    - I think I've handled all the comments from Thomas (I put
>> > comments inline below for the specifics)
>>
>> Thanks for handling all those.
>>
>> > - Following Thomas' comments on statistics, I want to add some
>> > quality assurance tests and find that the easiest way would be to
>> > have a few counters of what is happening in the sampler and expose
>> > that to the user.
>> >    - I'll be adding that in the next version if no one sees any
>> > objections to that.
>> >    - This will allow me to add a sanity test in JTreg about number of
>> > samples and average of sampling rate
>> >
>> > @Thomas: I had a few questions that I inlined below but I will
>> > summarize the "bigger ones" here:
>> >    - You mentioned constants are not using the right conventions, I
>> > looked around and didn't see any convention except normal naming then
>> > for static constants. Is that right?
>>
>> I looked through https://wiki.openjdk.java.net/display/HotSpot/StyleGuide
>> and the rule is to "follow an existing pattern and must have a
>> distinct appearance from other names". Which does not help a lot I
>> guess :/ The GC team started using upper camel case, e.g.
>> SomeOtherConstant, but very likely this is probably not applied
>> consistently throughout. So I am fine with not adding another style
>> (like kMaxStackDepth with the "k" in front with some unknown meaning).
>>
>> (Chances are you will find that style somewhere used anyway too,
>> apologies if so :/)
>>
>>
>> Thanks for that link, now I know
>> where to look. I used the upper camel case in my code as well then :) I
>> should have gotten them all.
>>
>>
>> > PS: I've also inlined my answers to Thomas below:
>> >
>> > On Tue, Jun 13, 2017 at 8:03 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>> > > Hi all,
>> > >
>> > > On Mon, 2017-06-12 at 11:11 -0700, JC Beyler wrote:
>> > > > Dear all,
>> > > >
>> > > > I've continued working on this and have done the following
>> > > webrev:
>> > > > http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>> > >
>> > > [...]
>> > > > Things I still need to do:
>> > > >    - Have to fix that TLAB case for the FastTLABRefill
>> > > >    - Have to start looking at the data to see that it is
>> > > consistent and does gather the right samples, right frequency, etc.
>> > > >    - Have to check the GC elements and what that produces
>> > > >    - Run a slowdebug run and ensure I fixed all those issues you
>> > > saw > Robbin
>> > > >
>> > > > Thanks for looking at the webrev and have a great week!
>> > >
>> > > scratching a bit on the surface of this change, so apologies for
>> > > rather shallow comments:
>> > >
>> > > - macroAssembler_x86.cpp:5604: while this is compiler code, and I
>> > > am not sure this is final, please avoid littering the code with
>> > > TODO remarks :) They tend to be candidates for later wtf moments
>> > > only.
>> > >
>> > > Just file a CR for that.
>> > >
>> > Newcomer question: what is a CR and not sure I have the rights to do
>> > that yet ? :)
>>
>> Apologies. CR is a change request, this suggests to file a bug in the
>> bug tracker. And you are right, you can't just create a new account in
>> the OpenJDK JIRA yourself.
:( >> >> >> Ok good to know, I'll continue with >> my own todo list but I'll work hard on not letting it slip in the webrevs >> anymore :) >> >> >> I was mostly referring to the >> "... but it is a TODO" part of that >> comment in >> macroassembler_x86.cpp. Comments about the why of the code >> are appreciated. >> >> [Note that I now understand >> that this is to some degree still work in >> progress. As long as the final >> changeset does no contain TODO's I am >> fine (and it's not a hard >> objection, rather their use in "final" code >> is typically limited in my >> experience)] >> >> 5603 // Currently, if this >> happens, just set back the actual end to >> where it was. >> 5604 // We miss a chance to >> sample here. >> >> Would be okay, if explaining >> "this" and the "why" of missing a chance >> to sample here would be best. >> >> Like maybe: >> >> // If we needed to refill >> TLABs, just set the actual end point to >> // the end of the TLAB again. >> We do not sample here although we could. >> >> Done with your comment, it works >> well in my mind. >> >> I am not sure whether "miss a >> chance to sample" meant "we could, but >> consciously don't because it's >> not that useful" or "it would be >> necessary but don't because >> it's too complicated to do.". >> >> Looking at the original comment >> once more, I am also not sure if that >> comment shouldn't referring to >> the "end" variable (not actual_end) >> because that's the variable >> that is responsible for taking the sampling >> path? (Going from the member >> description of ThreadLocalAllocBuffer). >> >> >> I've moved this code and it no >> longer shows up here but the rationale and answer was: >> >> So.. Yes, end is the variable >> provoking the sampling. Actual end is the actual end of the TLAB. >> >> What was happening here is that the >> code is resetting _end to point towards the end of the new TLAB. 
Because, >> we now have the end for >> sampling and _actual_end for >> the actual end, we need to update >> the actual_end as well. >> >> Normally, were we to do the real >> work here, we would calculate the (end - start) offset, then do: >> >> - Set the new end to : start + >> (old_end - old_start) >> - Set the actual end like we do here >> now where it because it is the actual end. >> >> Why is this not done here now >> anymore? >> - I was still debating which >> path to take: >> - Do it in the fast refill >> code, it has its perks: >> - In a world where fast >> refills are happening all the time or a lot, we can augment there the code >> to do the sampling >> - Remember what we had as an >> end before leaving the slowpath and check on return >> - This is what I'm doing >> now, it removes the need to go fix up all fast refill paths but if you >> remain in fast refill paths, >> you won't get sampling. I >> have to think of the consequences of >> that, maybe a future change later on? >> - I have the >> statistics now so I'm going to study that >> -> By the way, >> though my statistics are showing I'm missing some samples, if I turn off >> FastTlabRefill, it is the same >> loss so for now, it seems >> this does not occur in my simple >> test. >> >> >> >> But maybe I am only confused >> and it's best to just leave the comment >> away. :) >> >> Thinking about it some more, >> doesn't this not-sampling in this case >> mean that sampling does not >> work in any collector that does inline TLAB >> allocation at the moment? (Or >> is inline TLAB alloc automatically >> disabled with sampling somehow?) 
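
The end/actual_end bookkeeping discussed above can be sketched as a small standalone model. This is only an illustration under stated assumptions, not the webrev's code: SampledTlab, pick_next_sample, NoSpace and the byte-offset arithmetic are hypothetical stand-ins for ThreadLocalAllocBuffer, the sampling interval is fixed, and the refill path is reduced to a failure return.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of a TLAB with a sampling point.
// Offsets are byte counts from the buffer start; this is not HotSpot code.
constexpr std::size_t NoSpace = static_cast<std::size_t>(-1);

struct SampledTlab {
  std::size_t top = 0;      // next free byte
  std::size_t end = 0;      // limit checked by the fast path (may be a sample point)
  std::size_t actual_end;   // real end of the buffer
  std::size_t interval;     // bytes between samples (fixed here for simplicity)
  std::size_t samples = 0;  // number of allocations that took the sampling path

  SampledTlab(std::size_t size, std::size_t sample_interval)
      : actual_end(size), interval(sample_interval) {
    assert(sample_interval > 0);
    pick_next_sample();
  }

  // Move `end` to the next sample point, capped at the real end of the buffer.
  void pick_next_sample() {
    std::size_t candidate = top + interval;
    end = candidate < actual_end ? candidate : actual_end;
  }

  // Returns the offset of the new block, or NoSpace when a refill would be needed.
  std::size_t allocate(std::size_t bytes) {
    if (top + bytes <= end) {  // fast path: only `end` is consulted
      std::size_t obj = top;
      top += bytes;
      return obj;
    }
    if (top + bytes > actual_end) {
      return NoSpace;          // the buffer really is full
    }
    // Slow path: `end` was only a sample point. Set it back to the real
    // end, allocate, record the sample, then pick the next sample point.
    end = actual_end;
    std::size_t obj = top;
    top += bytes;
    ++samples;
    pick_next_sample();
    return obj;
  }
};
```

In this model, a fast path that resets end without consulting actual_end never reaches the sampling branch, which is the concern raised about inline TLAB allocation.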
>> >> That would indeed be a bigger >> TODO then :) >> >> >> Agreed, this remark made me think >> that perhaps as a first step the new way of doing it is better but I did >> have to: >> - Remove the const of the >> ThreadLocalBuffer remaining and hard_end methods >> - Move hard_end out of the header >> file to have a bit more logic there >> >> Please let me know what you think of >> that and if you prefer it this way or changing the fast refills. (I prefer >> this way now because it >> is more incremental). >> >> >> > > - calling >> HeapMonitoring::do_weak_oops() (which should probably be >> > > called weak_oops_do() like >> other similar methods) only if string >> > > deduplication is enabled >> (in g1CollectedHeap.cpp:4511) seems wrong. >> > >> > The call should be at least >> around 6 lines up outside the if. >> > >> > Preferentially in a method >> like process_weak_jni_handles(), including >> > additional logging. (No new >> (G1) gc phase without minimal logging >> > :)). >> > Done but really not sure >> because: >> > >> > I put for logging: >> > log_develop_trace(gc, >> freelist)("G1ConcRegionFreeing [other] : heap >> > monitoring"); >> >> I would think that "gc, ref" >> would be more appropriate log tags for >> this similar to jni handles. >> (I am als not sure what weak >> reference handling has to do with >> G1ConcRegionFreeing, so I am a >> bit puzzled) >> >> >> I was not sure what to put for the >> tags or really as the message. I cleaned it up a bit now to: >> log_develop_trace(gc, >> ref)("HeapSampling [other] : heap monitoring processing"); >> >> >> >> > Since weak_jni_handles didn't >> have logging for me to be inspired >> > from, I did that but >> unconvinced this is what should be done. >> >> The JNI handle processing does >> have logging, but only in >> ReferenceProcessor::process_discovered_references(). 
>> In >> process_weak_jni_handles() only >> overall time is measured (in a G1 >> specific way, since only G1 >> supports disabling reference procesing) :/ >> >> The code in ReferenceProcessor >> prints both time taken >> referenceProcessor.cpp:254, as >> well as the count, but strangely only in >> debug VMs. >> >> I have no idea why this logging >> is that unimportant to only print that >> in a debug VM. However there >> are reviews out for changing this area a >> bit, so it might be useful to >> wait for that (JDK-8173335). >> >> >> I cleaned it up a bit anyway and now >> it returns the count of objects that are in the system. >> >> >> > > - the change doubles the >> size of >> > > >> CollectedHeap::allocate_from_tlab_slow() above the "small and nice" >> > > threshold. Maybe it could >> be refactored a bit. >> > Done I think, it looks better >> to me :). >> >> In >> ThreadLocalAllocBuffer::handle_sample() I think the >> set_back_actual_end()/pick_next_sample() >> calls could be hoisted out of >> the "if" :) >> >> >> Done! >> >> >> > > - >> referenceProcessor.cpp:261: the change should add logging about >> > > the number of references >> encountered, maybe after the corresponding >> > > "JNI weak reference count" >> log message. >> > Just to double check, are you >> saying that you'd like to have the heap >> > sampler to keep in store how >> many sampled objects were encountered in >> > the >> HeapMonitoring::weak_oops_do? >> > - Would a return of the >> method with the number of handled >> > references and logging that >> work? >> >> Yes, it's fine if >> HeapMonitoring::weak_oops_do() only returned the >> number of processed weak oops. >> >> >> Done also (but I admit I have not >> tested the output yet) :) >> >> >> > - Additionally, would you >> prefer it in a separate block with its >> > GCTraceTime? >> >> Yes. 
Both kinds of information >> is interesting: while the time taken is >> typically more important, the >> next question would be why, and the >> number of references typically >> goes a long way there. >> >> See above though, it is >> probably best to wait a bit. >> >> >> Agreed that I "could" wait but, if >> it's ok, I'll just refactor/remove this when we get closer to something >> final. Either, JDK-8173335 >> has gone in and I will notice it now >> or it will soon and I can change it then. >> >> >> > > - >> threadLocalAllocBuffer.cpp:331: one more "TODO" >> > Removed it and added it to my >> personal todos to look at. >> > > > >> > > - >> threadLocalAllocBuffer.hpp: ThreadLocalAllocBuffer class >> > > documentation should be >> updated about the sampling additions. I >> > > would have no clue what the >> difference between "actual_end" and >> > > "end" would be from the >> given information. >> > If you are talking about the >> comments in this file, I made them more >> > clear I hope in the new >> webrev. If it was somewhere else, let me know >> > where to change. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Alan.Bateman at oracle.com Mon Oct 16 20:26:28 2017 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 16 Oct 2017 21:26:28 +0100 Subject: Status of JEP159? In-Reply-To: References: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com> <24b9381a-96fd-1a21-8e0d-069e0338ae5f@oracle.com> <4b3a9eb6-9ad0-37e7-596e-11a5130474a6@oracle.com> <50817987-269d-21d2-c61e-ab239e208a6b@oracle.com> <9057365e-1f47-802c-1004-62b9267ba4e8@oracle.com> Message-ID: On 16/10/2017 16:59, Robbin Ehn wrote: > > I see, in my case it's several CFLH agents with retransformation > capability. The one that added the public method is removed and no > longer called. > Leaving the other agents without the ability to retransform anymore > since they get the class file bytes without the public method. That case is tricky. 
Minimally we should put something in the spec and javadoc to discourage
disabling the event or removing the transformer when the agent is
retransformation capable and it has performed schema changes or
added/removed methods at initial load. There are several technical
solutions possible of course but hard to know whether it's worth
introducing more complexity for such rare cases.

-Alan

From david.holmes at oracle.com Mon Oct 16 21:46:06 2017
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Oct 2017 07:46:06 +1000
Subject: Status of JEP159?
In-Reply-To:
References: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com>
Message-ID: <2c912a09-6cf5-3dd7-3fbf-269b528254f3@oracle.com>

On 17/10/2017 1:03 AM, Thomas Stüfe wrote:
> Hi David,
>
> On Mon, Oct 16, 2017 at 1:20 PM, David Holmes wrote:
>
> Hi Thomas,
>
> On 16/10/2017 8:40 PM, Thomas Stüfe wrote:
>
> Hi all,
>
> just a small question.
>
> While examining a crash in jvmti_GetClassMethods (jdk9) I
> noticed that I am able to successfully add and remove methods in
> a redefined class.
>
> But JEP159 is still only in "submitted" stage. Was this feature
> added for another JEP?
>
>
> According to the spec, you are not allowed to add/remove methods.
> How did you add/remove them?
>
> https://docs.oracle.com/javase/9/docs/specs/jvmti.html#RedefineClasses
>
>
> David
> -----
>
>
> I used jdb (redefine). I found that add/remove method worked for private
> methods, but not for public ones, so that explains it. I was examining a
> bug which now turned out to be a regression of
> https://bugs.openjdk.java.net/browse/JDK-8149743 - only in my case it
> was not a lambda method but just an ordinary private method.
>
> Sorry for the noise.

It isn't noise. The spec prohibits adding/removing methods - period! It
doesn't make an exception for private methods (even if it may seem
reasonable to do so).

David

> Thomas
>
>
>
> Thank you!
>
> Kind Regards, Thomas
>
>

From david.holmes at oracle.com Mon Oct 16 21:47:38 2017
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Oct 2017 07:47:38 +1000
Subject: Status of JEP159?
In-Reply-To: <24b9381a-96fd-1a21-8e0d-069e0338ae5f@oracle.com>
References: <4ac20e04-db8b-160c-0da2-91d4e8fbe8c0@oracle.com>
 <24b9381a-96fd-1a21-8e0d-069e0338ae5f@oracle.com>
Message-ID:

On 16/10/2017 11:21 PM, Robbin Ehn wrote:
> Hi, if you use class file load hook you can add/remove public methods.
> Since this is before the class have been published we don't know how it
> should look.
> Whether this is according to spec or not, I have no clue.

There's no special dispensation in the spec for redefinition at CFLH
time AFAICS, so this seems like a bug to me!

David

> Is it on CFLH ?
>
> /Robbin
>
> On 10/16/2017 01:20 PM, David Holmes wrote:
>> Hi Thomas,
>>
>> On 16/10/2017 8:40 PM, Thomas Stüfe wrote:
>>> Hi all,
>>>
>>> just a small question.
>>>
>>> While examining a crash in jvmti_GetClassMethods (jdk9) I noticed
>>> that I am able to successfully add and remove methods in a redefined
>>> class.
>>>
>>> But JEP159 is still only in "submitted" stage. Was this feature added
>>> for another JEP?
>>
>> According to the spec, you are not allowed to add/remove methods. How
>> did you add/remove them?
>>
>> https://docs.oracle.com/javase/9/docs/specs/jvmti.html#RedefineClasses
>>
>> David
>> -----
>>
>>> Thank you!
>>>
>>> Kind Regards, Thomas

From serguei.spitsyn at oracle.com Tue Oct 17 05:35:51 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Oct 2017 22:35:51 -0700
Subject: RFR (XS): 8173936 [TESTBUG] test/serviceability/jvmti/ModuleAwareAgents/ClassFileLoadHook/MAAClassFileLoadHook.java needs to be re-examined
Message-ID: <86cb693e-85d2-7ccc-52c5-d2769b77cca3@oracle.com>

An HTML attachment was scrubbed...

From david.holmes at oracle.com Tue Oct 17 06:24:03 2017
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Oct 2017 16:24:03 +1000
Subject: RFR (XS): 8173936 [TESTBUG] test/serviceability/jvmti/ModuleAwareAgents/ClassFileLoadHook/MAAClassFileLoadHook.java needs to be re-examined
In-Reply-To: <86cb693e-85d2-7ccc-52c5-d2769b77cca3@oracle.com>
References: <86cb693e-85d2-7ccc-52c5-d2769b77cca3@oracle.com>
Message-ID: <5a4d588a-5555-9b3c-d631-6840353a95ae@oracle.com>

Hi Serguei,

On 17/10/2017 3:35 PM, serguei.spitsyn at oracle.com wrote:
> Please, review a fix for the test bug:
> https://bugs.openjdk.java.net/browse/JDK-8173936
>
>
> Webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8173936-MAA-cflh.1/
>
>
> Summary:
>   This test expects CFLH events in the JVMTI start phase but it no
> longer gets these events because
>   the Jigsaw implementation has changed in a way that no longer loads
> any classes in this phase
>   unless the capability can_generate_early_vmstart is enabled.
>   The fix is to expect CFLH events in the JVMTI start phase only if
> this capability is enabled.

That description confused me somewhat but now I get it. :) The class
the test is looking for is not loaded in the "start phase" now but in
the "primordial phase" - unless you set the can_generate_early_vmstart
capability to move the start phase back to where it used to be.

Okay - seems fine.

Thanks,
David

>
> Testing:
>   The fixed test ClassFileLoadHook/MAAClassFileLoadHook.java is passed now.
>
> Thanks,
> Serguei

From serguei.spitsyn at oracle.com Tue Oct 17 06:47:54 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Oct 2017 23:47:54 -0700
Subject: RFR (XS): 8173936 [TESTBUG] test/serviceability/jvmti/ModuleAwareAgents/ClassFileLoadHook/MAAClassFileLoadHook.java needs to be re-examined
In-Reply-To: <5a4d588a-5555-9b3c-d631-6840353a95ae@oracle.com>
References: <86cb693e-85d2-7ccc-52c5-d2769b77cca3@oracle.com>
 <5a4d588a-5555-9b3c-d631-6840353a95ae@oracle.com>
Message-ID:

Hi David,

On 10/16/17 23:24, David Holmes wrote:
> Hi Serguei,
>
> On 17/10/2017 3:35 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review a fix for the test bug:
>> https://bugs.openjdk.java.net/browse/JDK-8173936
>>
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8173936-MAA-cflh.1/
>>
>>
>>
>> Summary:
>>   This test expects CFLH events in the JVMTI start phase but it no
>> longer gets these events because
>>   the Jigsaw implementation has changed in a way that no longer
>> loads any classes in this phase
>>   unless the capability can_generate_early_vmstart is enabled.
>>   The fix is to expect CFLH events in the JVMTI start phase only if
>> this capability is enabled.
>
> That description confused me somewhat but now I get it. :) The class
> the test is looking for is not loaded in the "start phase" now but in
> the "primordial phase" - unless you set the can_generate_early_vmstart
> capability to move the start phase back to where it used to be.

Right.
Sorry, I was not explicit about it.

>
> Okay - seems fine.

Thank you a lot for quick review!

Thanks,
Serguei

>
> Thanks,
> David
>
>>
>> Testing:
>>   The fixed test ClassFileLoadHook/MAAClassFileLoadHook.java is
>> passed now.
>>
>> Thanks,
>> Serguei

From george.triantafillou at oracle.com Tue Oct 17 12:40:50 2017
From: george.triantafillou at oracle.com (George Triantafillou)
Date: Tue, 17 Oct 2017 08:40:50 -0400
Subject: RFR (XS): 8173936 [TESTBUG] test/serviceability/jvmti/ModuleAwareAgents/ClassFileLoadHook/MAAClassFileLoadHook.java needs to be re-examined
In-Reply-To: <86cb693e-85d2-7ccc-52c5-d2769b77cca3@oracle.com>
References: <86cb693e-85d2-7ccc-52c5-d2769b77cca3@oracle.com>
Message-ID: <5b245360-bc77-9595-0f2c-7071e115dbd8@oracle.com>

Hi Serguei,

This looks good.

-George

On 10/17/2017 1:35 AM, serguei.spitsyn at oracle.com wrote:
> Please, review a fix for the test bug:
> https://bugs.openjdk.java.net/browse/JDK-8173936
>
>
> Webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8173936-MAA-cflh.1/
>
>
> Summary:
>   This test expects CFLH events in the JVMTI start phase but it no
> longer gets these events because
>   the Jigsaw implementation has changed in a way that no longer loads
> any classes in this phase
>   unless the capability can_generate_early_vmstart is enabled.
>   The fix is to expect CFLH events in the JVMTI start phase only if
> this capability is enabled.
>
>
> Testing:
>   The fixed test ClassFileLoadHook/MAAClassFileLoadHook.java is passed
> now.
>
> Thanks,
> Serguei

From serguei.spitsyn at oracle.com Tue Oct 17 14:48:44 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Oct 2017 07:48:44 -0700
Subject: RFR (XS): 8173936 [TESTBUG] test/serviceability/jvmti/ModuleAwareAgents/ClassFileLoadHook/MAAClassFileLoadHook.java needs to be re-examined
In-Reply-To: <5b245360-bc77-9595-0f2c-7071e115dbd8@oracle.com>
References: <86cb693e-85d2-7ccc-52c5-d2769b77cca3@oracle.com>
 <5b245360-bc77-9595-0f2c-7071e115dbd8@oracle.com>
Message-ID: <2026c017-ca45-7dfb-f4fb-86d0d19e1a61@oracle.com>

An HTML attachment was scrubbed...
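
The expectation encoded by the 8173936 fix, as restated in the review above, can be written down as a tiny model. This is a sketch under assumptions, not the test's code: the Phase enum and both function names are hypothetical. The only fact taken from the thread is that the class the test watches is loaded in the primordial phase unless can_generate_early_vmstart is enabled, in which case it is loaded in the start phase.

```cpp
#include <cassert>

// Hypothetical names; models only what the thread states.
enum class Phase { Primordial, Start, Live };

// Phase in which the class watched by the test is loaded, per the review.
constexpr Phase load_phase_of_watched_class(bool can_generate_early_vmstart) {
  return can_generate_early_vmstart ? Phase::Start : Phase::Primordial;
}

// The fixed test expects a start-phase CFLH event only with the capability on.
constexpr bool expect_cflh_in_start_phase(bool can_generate_early_vmstart) {
  return load_phase_of_watched_class(can_generate_early_vmstart) == Phase::Start;
}

// Compile-time checks of the two cases described in the summary.
static_assert(expect_cflh_in_start_phase(true), "capability on: event expected");
static_assert(!expect_cflh_in_start_phase(false), "capability off: no event expected");
```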

From david.holmes at oracle.com Wed Oct 18 01:53:21 2017
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Oct 2017 11:53:21 +1000
Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To: <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com>
References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com>
 <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com>
Message-ID:

Hi Yasumasa,

By chance we ran into this bug which I analysed yesterday:

https://bugs.openjdk.java.net/browse/JDK-8189390

We hit the assertion:

# Internal Error (/open/src/hotspot/share/runtime/perfMemory.cpp:216),
pid=17874, tid=17875
# assert(_prologue != __null) failed: called before initialization
#

which is misleading because it can fail if called before initialization, or
after PerfMemory::destroy has been called.

With your changes you no longer null out _prologue so the assertion would
now not fail and we'd proceed to access the deleted memory region!

I'm unclear why you no longer clear all the fields set during
initialization? But it seems to me that there are various checks of
_prologue that should really be checking is_initialized() and/or
is_destroyed() as a guard.

Thanks,
David

On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote:
> PING:
>
> Could you review it?
>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2017/10/03 13:18, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> I added gtest unit test case for this change in new webrev:
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/
>>
>> Could you review it?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>
>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga :
>>> Hi all,
>>>
>>> I uploaded new webrev to be adapted to jdk10/hs:
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote:
>>>>
>>>> PING:
>>>>
>>>> Have you checked this issue?
>>>>
>>>>>>
http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/
>>>>
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote:
>>>>>
>>>>> PING:
>>>>>
>>>>> Have you checked this issue?
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I want to discuss about JDK-8151815: Could not parse core image with
>>>>>> JSnap.
>>>>>>
>>>>>>
>>>>>> In last year, I found JSnap cannot parse coredump and I've sent
>>>>>> review
>>>>>> request for it as JDK-8151815. However it has not been reviewed yet
>>>>>> [1].
>>>>>>
>>>>>> We've discussed about safety implementation, but we could not get
>>>>>> consensus.
>>>>>> IMHO all SA tools should be handled java processes and core images,
>>>>>> and PerfCounter value is useful. So I fix this issue.
>>>>>>
>>>>>> I uploaded new webrev for this issue. I think this patch is safety
>>>>>> because new flag PerfMemory::_destroyed guards double free, and all
>>>>>> members in PerfMemory is accessible (they are not munmap'ed)
>>>>>>
>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/
>>>>>>
>>>>>>
>>>>>> Can you cooperate?
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html
>>>>>>
>>>>>>
>>>

From erik.gahlin at oracle.com Wed Oct 18 02:23:35 2017
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Wed, 18 Oct 2017 04:23:35 +0200
Subject: RFR(S): 8189425: Minor updates in support of closed changes
Message-ID: <59E6BB27.8020605@oracle.com>

Hi,

Could I have a review of this change that will adjust an assertion and
remove a lock associated with JFR.
Webrev: http://cr.openjdk.java.net/~egahlin/8189425_0 Bug: https://bugs.openjdk.java.net/browse/JDK-8189425 Thanks Erik From david.holmes at oracle.com Wed Oct 18 02:33:45 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 12:33:45 +1000 Subject: RFR(S): 8189425: Minor updates in support of closed changes In-Reply-To: <59E6BB27.8020605@oracle.com> References: <59E6BB27.8020605@oracle.com> Message-ID: <9acd0472-de6b-4f6f-454e-7a325492af15@oracle.com> Hi Erik, On 18/10/2017 12:23 PM, Erik Gahlin wrote: > Hi, > > Could I have a review of this change that will adjust an assertion and Can you explain the adjustment please. > remove a lock associated with JFR. That bit is fine :) Thanks, David > Webrev: > http://cr.openjdk.java.net/~egahlin/8189425_0 > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8189425 > > Thanks > Erik > > From erik.gahlin at oracle.com Wed Oct 18 02:34:18 2017 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Wed, 18 Oct 2017 04:34:18 +0200 Subject: RFR: 8189440: Event tracing macros for allocation and weak oops processing Message-ID: <59E6BDAA.3080502@oracle.com> Hi, Could I have a review of a change that adds two macros to be used with event-based JVM tracing. Bug: https://bugs.openjdk.java.net/browse/JDK-8189440 Webrev: http://cr.openjdk.java.net/~egahlin/8189440_0 Thanks Erik From yasuenag at gmail.com Wed Oct 18 02:37:11 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 18 Oct 2017 11:37:11 +0900 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> Message-ID: Hi David, > With your changes you no longer null out _prologue so the assertion would > now not fail and we'd proceed to access the deleted memory region! On Linux, PerfMemory::delete_memory_region() does not call munmap() for PerfMemory. 
> I'm unclear why you no longer clear all the fields set during > initialization? PerfMemory.java in jdk.hotspot.agent needs these field values. `jhsdb jsnap --core` is failed if they are cleared. > But it seems to me that there are various checks of > _prologue that should really be checking is_initialized() and/or > is_destroyed() as a guard. Should I change all assertions for _prologue? Thanks, Yasumasa 2017-10-18 10:53 GMT+09:00 David Holmes : > Hi Yasumasa, > > By chance we ran into this bug which I analysed yesterday: > > https://bugs.openjdk.java.net/browse/JDK-8189390 > > We hit the assertion: > > # Internal Error (/open/src/hotspot/share/runtime/perfMemory.cpp:216), > pid=17874, tid=17875 > # assert(_prologue != __null) failed: called before initialization > # > > which is misleading because it can fail if called before initialization, or > after PerfMemory::destroy has been called. > > With your changes you no longer null out _prologue so the assertion would > now not fail and we'd proceed to access the deleted memory region! > > I'm unclear why you no longer clear all the fields set during > initialization? But it seems to me that there are various checks of > _prologue that should really be checking is_initialized() and/or > is_destroyed() as a guard. > > Thanks, > David > > > On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >> >> PING: >> >> Could you review it? >> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >> >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>> >>> Hi all, >>> >>> I added gtest unit test case for this change in new webrev: >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>> >>> Could you review it? 
>>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> >>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga : >>>> >>>> Hi all, >>>> >>>> I uploaded new webrev to be adapted to jdk10/hs: >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>> >>>>> >>>>> PING: >>>>> >>>>> Have you checked this issue? >>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>> >>>>> >>>>> >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>> >>>>>> >>>>>> PING: >>>>>> >>>>>> Have you checked this issue? >>>>>> >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>> >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> I want to discuss about JDK-8151815: Could not parse core image with >>>>>>> JSnap. >>>>>>> >>>>>>> >>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>> review >>>>>>> request for it as JDK-8151815. However it has not been reviewed yet >>>>>>> [1]. >>>>>>> >>>>>>> We've discussed about safety implementation, but we could not get >>>>>>> consensus. >>>>>>> IMHO all SA tools should be handled java processes and core images, >>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>> >>>>>>> I uploaded new webrev for this issue. I think this patch is safety >>>>>>> because new flag PerfMemory::_destroyed guards double free, and all >>>>>>> members in PerfMemory is accessible (they are not munmap'ed) >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>> >>>>>>> >>>>>>> Can you cooperate? 
>>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> [1] >>>>>>> >>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>> >>>> > From david.holmes at oracle.com Wed Oct 18 03:44:32 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 13:44:32 +1000 Subject: RFR: 8189440: Event tracing macros for allocation and weak oops processing In-Reply-To: <59E6BDAA.3080502@oracle.com> References: <59E6BDAA.3080502@oracle.com> Message-ID: Hi Erik, On 18/10/2017 12:34 PM, Erik Gahlin wrote: > Hi, > > Could I have a review of a change that adds two macros to be used with > event-based JVM tracing. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8189440 > > Webrev: > http://cr.openjdk.java.net/~egahlin/8189440_0 Reviewed - though all somewhat mysterious in isolation :( My only real query is in jniHandles.cpp: JvmtiExport::weak_oops_do(is_alive, f); + TRACE_WEAK_OOPS_DO(is_alive, f); Can't/shouldn't the tracing be done inside weak_oops_do? Thanks, David > Thanks > Erik From david.holmes at oracle.com Wed Oct 18 03:55:10 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 13:55:10 +1000 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> Message-ID: <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: > Hi David, > >> With your changes you no longer null out _prologue so the assertion would >> now not fail and we'd proceed to access the deleted memory region! > > On Linux, PerfMemory::delete_memory_region() does not call munmap() > for PerfMemory. Perhaps not but there are still other actions that happen and the point is we should not be able to continue to use PerfMemory once it has been destroyed (even if the destruction is only logical). 
> >> I'm unclear why you no longer clear all the fields set during >> initialization? > > PerfMemory.java in jdk.hotspot.agent needs these field values. > `jhsdb jsnap --core` is failed if they are cleared. I'm not familiar with these tools. When do we produce a core file after calling PerfMemory::destroy ? > >> But it seems to me that there are various checks of >> _prologue that should really be checking is_initialized() and/or >> is_destroyed() as a guard. > > Should I change all assertions for _prologue? Assertions and direct guards. Checking _prologue is a placeholder for the real check. Thanks, David > > Thanks, > > Yasumasa > > > 2017-10-18 10:53 GMT+09:00 David Holmes : >> Hi Yasumasa, >> >> By chance we ran into this bug which I analysed yesterday: >> >> https://bugs.openjdk.java.net/browse/JDK-8189390 >> >> We hit the assertion: >> >> # Internal Error (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >> pid=17874, tid=17875 >> # assert(_prologue != __null) failed: called before initialization >> # >> >> which is misleading because it can fail if called before initialization, or >> after PerfMemory::destroy has been called. >> >> With your changes you no longer null out _prologue so the assertion would >> now not fail and we'd proceed to access the deleted memory region! >> >> I'm unclear why you no longer clear all the fields set during >> initialization? But it seems to me that there are various checks of >> _prologue that should really be checking is_initialized() and/or >> is_destroyed() as a guard. >> >> Thanks, >> David >> >> >> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>> >>> PING: >>> >>> Could you review it? 
>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>> >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>> >>>> Hi all, >>>> >>>> I added gtest unit test case for this change in new webrev: >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>> >>>> Could you review it? >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> >>>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga : >>>>> >>>>> Hi all, >>>>> >>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>> >>>>>> >>>>>> PING: >>>>>> >>>>>> Have you checked this issue? >>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>> >>>>>>> >>>>>>> PING: >>>>>>> >>>>>>> Have you checked this issue? >>>>>>> >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>> >>>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I want to discuss about JDK-8151815: Could not parse core image with >>>>>>>> JSnap. >>>>>>>> >>>>>>>> >>>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>>> review >>>>>>>> request for it as JDK-8151815. However it has not been reviewed yet >>>>>>>> [1]. >>>>>>>> >>>>>>>> We've discussed about safety implementation, but we could not get >>>>>>>> consensus. >>>>>>>> IMHO all SA tools should be handled java processes and core images, >>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>> >>>>>>>> I uploaded new webrev for this issue. 
I think this patch is safety
>>>>>>>> because new flag PerfMemory::_destroyed guards double free, and all
>>>>>>>> members in PerfMemory is accessible (they are not munmap'ed)
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/
>>>>>>>>
>>>>>>>>
>>>>>>>> Can you cooperate?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> [1]
>>>>>>>>
>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html
>>>>>>>>
>>>>>
>>

From yasuenag at gmail.com Wed Oct 18 04:27:42 2017
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 18 Oct 2017 13:27:42 +0900
Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To: <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com>
References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com>
Message-ID:

Hi David,

2017-10-18 12:55 GMT+09:00 David Holmes :
> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote:
>>
>> Hi David,
>>
>>> With your changes you no longer null out _prologue so the assertion would
>>> now not fail and we'd proceed to access the deleted memory region!
>>
>>
>> On Linux, PerfMemory::delete_memory_region() does not call munmap()
>> for PerfMemory.
>
>
> Perhaps not but there are still other actions that happen and the point is
> we should not be able to continue to use PerfMemory once it has been
> destroyed (even if the destruction is only logical).

I received the same comment from Dmitry in the past, but we couldn't
decide what we should do.

http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html

In that discussion, I uploaded another webrev which adds other fields for JSnap.
Is it suitable?

http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/

>>> I'm unclear why you no longer clear all the fields set during
>>> initialization?
>>
>>
>> PerfMemory.java in jdk.hotspot.agent needs these field values.
>> `jhsdb jsnap --core` is failed if they are cleared.
>
>
> I'm not familiar with these tools. When do we produce a core file after
> calling PerfMemory::destroy ?

PerfMemory::destroy() is called before aborting.

-----------------------
#0  perfMemory_exit ()
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80
#1  0x00007f99b091c949 in os::shutdown ()
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483
#2  0x00007f99b091c980 in os::abort (dump_core=)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503
#3  0x00007f99b0b689c3 in VMError::report_and_die (
    this=this@entry=0x7ffcacf40b50)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
#4  0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig@entry=11,
    info=info@entry=0x7ffcacf40df0, ucVoid=ucVoid@entry=0x7ffcacf40cc0,
    abort_if_unrecognized=abort_if_unrecognized@entry=1)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
-----------------------


Thanks,

Yasumasa


>>> But it seems to me that there are various checks of
>>> _prologue that should really be checking is_initialized() and/or
>>> is_destroyed() as a guard.
>>
>>
>> Should I change all assertions for _prologue?
>
>
> Assertions and direct guards. Checking _prologue is a placeholder for the
> real check.
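
The point about guarding on lifecycle state rather than on _prologue can be illustrated with a minimal sketch. This is not the HotSpot PerfMemory code and not taken from any of the webrevs; the class name PerfRegionSketch and all of its members are hypothetical, and the real code uses assertions where this sketch returns false. It only assumes what the thread describes: destruction is logical, the fields stay readable for post-mortem tools, and runtime access is refused before initialization and after destroy.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a perf-data region; NOT the HotSpot PerfMemory class.
class PerfRegionSketch {
 public:
  void initialize(char* start, std::size_t capacity) {
    if (_initialized) return;  // guard on the state flag, not on _start != nullptr
    _start = start;
    _capacity = capacity;
    _initialized = true;
  }

  // Logical destruction: the fields are left intact so that a post-mortem
  // tool can still read them, but the runtime must stop using the region.
  void destroy() {
    if (!_initialized || _destroyed) return;  // also guards a second destroy
    _destroyed = true;
  }

  bool is_initialized() const { return _initialized; }
  bool is_destroyed() const { return _destroyed; }
  bool is_usable() const { return _initialized && !_destroyed; }

  // Runtime-side access: refused before initialize() and after destroy().
  bool write_byte(std::size_t offset, char value) {
    if (!is_usable() || offset >= _capacity) return false;
    _start[offset] = value;
    return true;
  }

  // Tool-side view: still available after destroy().
  const char* start() const { return _start; }

 private:
  char* _start = nullptr;
  std::size_t _capacity = 0;
  bool _initialized = false;
  bool _destroyed = false;
};
```

With this shape a non-null start pointer no longer implies that the region may be used, which is the situation described above once the fields are kept after destroy.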
From david.holmes at oracle.com Wed Oct 18 04:44:58 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 14:44:58 +1000 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To:
References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com>
Message-ID: <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com>

On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote:
> Hi David,
>
> 2017-10-18 12:55 GMT+09:00 David Holmes :
>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote:
>>>
>>> Hi David,
>>>
>>>> With your changes you no longer null out _prologue so the assertion would
>>>> now not fail and we'd proceed to access the deleted memory region!
>>>
>>>
>>> On Linux, PerfMemory::delete_memory_region() does not call munmap()
>>> for PerfMemory.
>>
>>
>> Perhaps not but there are still other actions that happen and the point is
>> we should not be able to continue to use PerfMemory once it has been
>> destroyed (even if the destruction is only logical).
>
> I received same comment from Dmitry in the past, but we couldn't
> decide how should we do.
>
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html
>
> In that discussion, I uploaded another webrev which adds other fields for JSnap.
> Is it suitable?
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/

I don't think we need the extra fields, just ensure the existing ones
can't be accessed (other than by the tools) after destroy is called.

>
>>>> I'm unclear why you no longer clear all the fields set during
>>>> initialization?
>>>
>>>
>>> PerfMemory.java in jdk.hotspot.agent needs these field values.
>>> `jhsdb jsnap --core` is failed if they are cleared.
>>
>>
>> I'm not familiar with these tools. When do we produce a core file after
>> calling PerfMemory::destroy ?
>
> PerfMemory::destroy() is called before aborting.

Ah - right. I assume we need to close off the perfdata file before we abort.

Thanks,
David

From yasuenag at gmail.com Wed Oct 18 06:34:13 2017
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 18 Oct 2017 15:34:13 +0900
Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To: <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com>
References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com>
Message-ID:

Hi David,

> I don't think we need the extra fields, just ensure the existing ones can't
> be accessed (other than by the tools) after destroy is called.

I've added PerfMemory::is_useable() to check whether we can access
PerfMemory.
I think this webrev prevents access to PerfMemory after the destroy() call.

http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/


Thanks,

Yasumasa

From david.holmes at oracle.com Wed Oct 18 07:09:22 2017
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Oct 2017 17:09:22 +1000
Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To:
References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com>
Message-ID: <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com>

Hi Yasumasa,

On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote:
> Hi David,
>
>> I don't think we need the extra fields, just ensure the existing ones can't
>> be accessed (other than by the tools) after destroy is called.
>
> I've added PerfMemory::is_useable() to check whether we can access to
> PerfMemory.
> I think this webrev prevent to access to PerfMemory after destroy() call.
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/

This:

  90 void PerfMemory::initialize() {
  91
  92   if (_prologue != NULL)
  93     // initialization already performed
  94     return;

shouldn't check _prologue, but is_initialized().

 213   assert(is_useable(), "called before initialization");

-> "called before init or after destroy"

Could add a similar assert in PerfMemory::mark_updated().

Let's see what Serguei thinks. :)

Thanks,
David

From erik.helin at oracle.com Wed Oct 18 07:32:36 2017
From: erik.helin at oracle.com (Erik Helin)
Date: Wed, 18 Oct 2017 09:32:36 +0200
Subject: RFR: 8189440: Event tracing macros for allocation and weak oops processing
In-Reply-To: <59E6BDAA.3080502@oracle.com>
References: <59E6BDAA.3080502@oracle.com>
Message-ID: <065721aa-c70e-f764-1c68-147a68786e26@oracle.com>

I'm adding hotspot-gc-dev since GC code is touched and not all GC
developers are on serviceability-dev (probably just me).

Thanks,
Erik

On 10/18/2017 04:34 AM, Erik Gahlin wrote:
> Hi,
>
> Could I have a review of a change that adds two macros to be used with
> event-based JVM tracing.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8189440
>
> Webrev:
> http://cr.openjdk.java.net/~egahlin/8189440_0
>
> Thanks
> Erik

From yasuenag at gmail.com Wed Oct 18 07:39:37 2017
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 18 Oct 2017 16:39:37 +0900
Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To: <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> Message-ID: Hi David, Thank you for your comment. I uploaded new webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ Serguei, please comment about this :-) Yasumasa 2017-10-18 16:09 GMT+09:00 David Holmes : > Hi Yasumasa, > > On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >> >> Hi David, >> >>> I don't think we need the extra fields, just ensure the existing ones >>> can't >>> be accessed (other than by the tools) after destroy is called. >> >> >> I've added PerfMemory::is_useable() to check whether we can access to >> PerfMemory. >> I think this webrev prevent to access to PerfMemory after destroy() call. >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ > > > This: > > 90 void PerfMemory::initialize() { > 91 > 92 if (_prologue != NULL) > 93 // initialization already performed > 94 return; > > shouldn't check _prologue, but is_initialized(). > > 213 assert(is_useable(), "called before initialization"); > > -> "called before init or after destroy" > > Could add a similar assert in PerfMemory::mark_updated(). > > Let's see what Serguei thinks. :) > > > Thanks, > David > >> >> Thanks, >> >> Yasumasa >> >> >> 2017-10-18 13:44 GMT+09:00 David Holmes : >>> >>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>> >>>> >>>> Hi David, >>>> >>>> 2017-10-18 12:55 GMT+09:00 David Holmes : >>>>> >>>>> >>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>> >>>>>> >>>>>> >>>>>> Hi David, >>>>>> >>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>> would >>>>>>> now not fail and we'd proceed to access the deleted memory region! 
>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Linux, PerfMemory::delete_memory_region() does not call munmap() >>>>>> for PerfMemory. >>>>> >>>>> >>>>> >>>>> >>>>> Perhaps not but there are still other actions that happen and the point >>>>> is >>>>> we should not be able to continue to use PerfMemory once it has been >>>>> destroyed (even if the destruction is only logical). >>>> >>>> >>>> >>>> I received same comment from Dmitry in the past, but we couldn't >>>> decide how should we do. >>>> >>>> >>>> >>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>> >>>> In that discussion, I uploaded another webrev which adds other fields >>>> for >>>> JSnap. >>>> Is it suitable? >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>> >>> >>> >>> I don't think we need the extra fields, just ensure the existing ones >>> can't >>> be accessed (other than by the tools) after destroy is called. >>> >>>> >>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>> initialization? >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>> >>>>> >>>>> >>>>> >>>>> I'm not familiar with these tools. When do we produce a core file after >>>>> calling PerfMemory::destroy ? >>>> >>>> >>>> >>>> PerfMemory::destroy() is called before aborting. >>> >>> >>> >>> Ah - right. I assume we need to close off the perfdata file before we >>> abort. 
>>> >>> Thanks, >>> David >>> >>> >>>> ----------------------- >>>> #0 perfMemory_exit () >>>> at >>>> >>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>> #1 0x00007f99b091c949 in os::shutdown () >>>> at >>>> >>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>> #2 0x00007f99b091c980 in os::abort (dump_core=) >>>> at >>>> >>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>> #3 0x00007f99b0b689c3 in VMError::report_and_die ( >>>> this=this at entry=0x7ffcacf40b50) >>>> at >>>> >>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>> #4 0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig at entry=11, >>>> info=info at entry=0x7ffcacf40df0, >>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>> at >>>> >>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>> ----------------------- >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>>>>> But it seems to me that there are various checks of >>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>> is_destroyed() as a guard. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Should I change all assertions for _prologue? >>>>> >>>>> >>>>> >>>>> >>>>> Assertions and direct guards. Checking _prologue is a placeholder for >>>>> the >>>>> real check. 
>>>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes : >>>>>>> >>>>>>> >>>>>>> >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>> >>>>>>> We hit the assertion: >>>>>>> >>>>>>> # Internal Error >>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>> pid=17874, tid=17875 >>>>>>> # assert(_prologue != __null) failed: called before initialization >>>>>>> # >>>>>>> >>>>>>> which is misleading because it can fail if called before >>>>>>> initialization, >>>>>>> or >>>>>>> after PerfMemory::destroy has been called. >>>>>>> >>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>> would >>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>> >>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>> initialization? But it seems to me that there are various checks of >>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>> is_destroyed() as a guard. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> >>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> PING: >>>>>>>> >>>>>>>> Could you review it? >>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>> >>>>>>>>> Could you review it? 
>>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga : >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> PING: >>>>>>>>>>> >>>>>>>>>>> Have you checked this issue? >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> PING: >>>>>>>>>>>> >>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse core image >>>>>>>>>>>>> with >>>>>>>>>>>>> JSnap. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>>>>>>>> review >>>>>>>>>>>>> request for it as JDK-8151815. However it has not been reviewed >>>>>>>>>>>>> yet >>>>>>>>>>>>> [1]. >>>>>>>>>>>>> >>>>>>>>>>>>> We've discussed about safety implementation, but we could not >>>>>>>>>>>>> get >>>>>>>>>>>>> consensus. >>>>>>>>>>>>> IMHO all SA tools should be handled java processes and core >>>>>>>>>>>>> images, >>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. 
>>>>>>>>>>>>> >>>>>>>>>>>>> I uploaded new webrev for this issue. I think this patch is >>>>>>>>>>>>> safety >>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double free, and >>>>>>>>>>>>> all >>>>>>>>>>>>> members in PerfMemory is accessible (they are not munmap'ed) >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> [1] >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>> >>>>>>>>>> >>>>>>> >>>>> >>> > From serguei.spitsyn at oracle.com Wed Oct 18 09:25:01 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 18 Oct 2017 02:25:01 -0700 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> Message-ID: <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> An HTML attachment was scrubbed... URL: From yasuenag at gmail.com Wed Oct 18 09:43:20 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 18 Oct 2017 18:43:20 +0900 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. 
In-Reply-To: <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com>
References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com>
Message-ID:

Hi Serguei,

Should we use OrderAccess::load_acquire() to check _initialized and _destroyed?
IMHO it is not needed, because initialization and destruction of PerfMemory do not seem to be multi-threaded.

Thanks,

Yasumasa.

2017/10/18 6:25 PM "serguei.spitsyn at oracle.com" :

> Hi Yasumasa,
>
> Sorry for a quite late participation.
>
> I looked at the previous webrevs and think that this one is much better.
>
> Some concern is if we need any kind of synchronization here, e.g. CAS.
> But it depends on the PerfMemory class usage.
>
> Should we make the static variables '_initialized' and '_destroyed'
> volatile?
>
> Also, the '_initialized' is set to 1 with:
>    159    OrderAccess::release_store(&_initialized, 1);
>
> Should we do the same to set the '_destroyed'?:
>    200    _destroyed = true;
>
>
> Thanks,
> Serguei
>
>
> On 10/18/17 00:39, Yasumasa Suenaga wrote:
>
> Hi David,
>
> Thank you for your comment.
> I uploaded new webrev:
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/
>
> Serguei, please comment about this :-)
>
>
> Yasumasa
>
>
>
> 2017-10-18 16:09 GMT+09:00 David Holmes :
>
> Hi Yasumasa,
>
> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote:
>
>
> Hi David,
>
>
> I don't think we need the extra fields, just ensure the existing ones
> can't
> be accessed (other than by the tools) after destroy is called.
>
>
>
> I've added PerfMemory::is_useable() to check whether we can access to
> PerfMemory.
> I think this webrev prevent to access to PerfMemory after destroy() call.
> > http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ > > > > This: > > 90 void PerfMemory::initialize() { > 91 > 92 if (_prologue != NULL) > 93 // initialization already performed > 94 return; > > shouldn't check _prologue, but is_initialized(). > > 213 assert(is_useable(), "called before initialization"); > > -> "called before init or after destroy" > > Could add a similar assert in PerfMemory::mark_updated(). > > Let's see what Serguei thinks. :) > > > Thanks, > David > > > > Thanks, > > Yasumasa > > > 2017-10-18 13:44 GMT+09:00 David Holmes : > > > On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: > > > > Hi David, > > 2017-10-18 12:55 GMT+09:00 David Holmes : > > > > On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: > > > > > Hi David, > > > With your changes you no longer null out _prologue so the assertion > would > now not fail and we'd proceed to access the deleted memory region! > > > > > > On Linux, PerfMemory::delete_memory_region() does not call munmap() > for PerfMemory. > > > > > > Perhaps not but there are still other actions that happen and the point > is > we should not be able to continue to use PerfMemory once it has been > destroyed (even if the destruction is only logical). > > > > > I received same comment from Dmitry in the past, but we couldn't > decide how should we do. > > > http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html > > In that discussion, I uploaded another webrev which adds other fields > for > JSnap. > Is it suitable? > > http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ > > > > > I don't think we need the extra fields, just ensure the existing ones > can't > be accessed (other than by the tools) after destroy is called. > > > I'm unclear why you no longer clear all the fields set during > initialization? > > > > > > PerfMemory.java in jdk.hotspot.agent needs these field values. > `jhsdb jsnap --core` is failed if they are cleared. > > > > > > I'm not familiar with these tools. 
When do we produce a core file after > calling PerfMemory::destroy ? > > > > > PerfMemory::destroy() is called before aborting. > > > > > Ah - right. I assume we need to close off the perfdata file before we > abort. > > Thanks, > David > > > > ----------------------- > #0 perfMemory_exit () > at > > /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 > #1 0x00007f99b091c949 in os::shutdown () > at > > /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 > #2 0x00007f99b091c980 in os::abort (dump_core=) > at > > /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 > #3 0x00007f99b0b689c3 in VMError::report_and_die ( > this=this at entry=0x7ffcacf40b50) > at > > /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 > #4 0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig at entry=11, > info=info at entry=0x7ffcacf40df0, > ucVoid=ucVoid at entry=0x7ffcacf40cc0, > abort_if_unrecognized=abort_if_unrecognized at entry=1) > at > > /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 > ----------------------- > > > Thanks, > > Yasumasa > > > > But it seems to me that there are various checks of > _prologue that should really be checking is_initialized() and/or > is_destroyed() as a guard. > > > > > > Should I change all assertions for _prologue? > > > > > > Assertions and direct guards. Checking _prologue is a placeholder for > the > real check. 
> > > Thanks, > David > > > > > Thanks, > > Yasumasa > > > 2017-10-18 10:53 GMT+09:00 David Holmes : > > > > > Hi Yasumasa, > > By chance we ran into this bug which I analysed yesterday: > https://bugs.openjdk.java.net/browse/JDK-8189390 > > We hit the assertion: > > # Internal Error > (/open/src/hotspot/share/runtime/perfMemory.cpp:216), > pid=17874, tid=17875 > # assert(_prologue != __null) failed: called before initialization > # > > which is misleading because it can fail if called before > initialization, > or > after PerfMemory::destroy has been called. > > With your changes you no longer null out _prologue so the assertion > would > now not fail and we'd proceed to access the deleted memory region! > > I'm unclear why you no longer clear all the fields set during > initialization? But it seems to me that there are various checks of > _prologue that should really be checking is_initialized() and/or > is_destroyed() as a guard. > > Thanks, > David > > > On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: > > > > > > PING: > > Could you review it? > > > http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ > > > > > > > > Thanks, > > Yasumasa > > > On 2017/10/03 13:18, Yasumasa Suenaga wrote: > > > > > > Hi all, > > I added gtest unit test case for this change in new webrev: > > http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ > > Could you review it? > > > Thanks, > > Yasumasa > > > > 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga : > > > > > > Hi all, > > I uploaded new webrev to be adapted to jdk10/hs: > > http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ > > > Thanks, > > Yasumasa > > > On 2017/09/21 7:45, Yasumasa Suenaga wrote: > > > > > > > PING: > > Have you checked this issue? > > > http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ > > > > > > > > > Yasumasa > > > On 2017/07/01 23:43, Yasumasa Suenaga wrote: > > > > > > > PING: > > Have you checked this issue? 
> > > Yasumasa > > > On 2017/06/13 14:10, Yasumasa Suenaga wrote: > > > > > > > Hi all, > > I want to discuss about JDK-8151815: Could not parse core image > with > JSnap. > > > In last year, I found JSnap cannot parse coredump and I've sent > review > request for it as JDK-8151815. However it has not been reviewed > yet > [1]. > > We've discussed about safety implementation, but we could not > get > consensus. > IMHO all SA tools should be handled java processes and core > images, > and PerfCounter value is useful. So I fix this issue. > > I uploaded new webrev for this issue. I think this patch is > safety > because new flag PerfMemory::_destroyed guards double free, and > all > members in PerfMemory is accessible (they are not munmap'ed) > > http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ > > > Can you cooperate? > > > Thanks, > > Yasumasa > > > [1] > > > > http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Wed Oct 18 09:48:14 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 19:48:14 +1000 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> Message-ID: <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> Hi Serguei On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote: > Hi Yasumasa, > > Sorry for a quite late participation. > > I looked at the previous webrevs and think that this one is much better. > > Some concern is if we need any kind of synchronization here, e.g. CAS. 
> But it depends on the PerfMemory class usage.
>
> Should we make the static variables '_initialized' and '_destroyed'
> volatile?

For good measure - yes.

> Also, the '_initialized' is set to 1 with:
>    159    OrderAccess::release_store(&_initialized, 1);
>
> Should we do the same to set the '_destroyed'?:
>    200    _destroyed = true;

There is a benign initialization race but we need the release_store to ensure all the data fields can be read if _initialized is seen as true. But what is missing is a load_acquire() in is_initialized() to ensure we synchronize with that store!

There is also a potential for a destruction race (if multiple aborts happen concurrently in different threads) but that also seems benign. In this case there is no data being set so the store to _destroyed does not need to be a release_store.

Cheers,
David

>
> Thanks,
> Serguei
>
>
> On 10/18/17 00:39, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> Thank you for your comment.
>> I uploaded new webrev:
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/
>>
>> Serguei, please comment about this :-)
>>
>>
>> Yasumasa
>>
>>
>>
>> 2017-10-18 16:09 GMT+09:00 David Holmes:
>>> Hi Yasumasa,
>>>
>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>>
>>>>> I don't think we need the extra fields, just ensure the existing ones
>>>>> can't
>>>>> be accessed (other than by the tools) after destroy is called.
>>>>
>>>> I've added PerfMemory::is_useable() to check whether we can access to
>>>> PerfMemory.
>>>> I think this webrev prevent to access to PerfMemory after destroy() call.
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/
>>>
>>> This:
>>>
>>> 90 void PerfMemory::initialize() {
>>> 91
>>> 92   if (_prologue != NULL)
>>> 93     // initialization already performed
>>> 94     return;
>>>
>>> shouldn't check _prologue, but is_initialized().
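
For readers following the ordering argument in this thread (a release store when publishing _initialized, a matching acquire load in is_initialized(), and a plain store for _destroyed), here is a minimal sketch in standard C++. It is not the HotSpot code and not the webrev: std::atomic stands in for OrderAccess, and the perf_sketch namespace and every g_* name are hypothetical, invented only for illustration. It also assumes a single initializing thread, which sidesteps the benign initialization race mentioned in the discussion.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Illustrative stand-in for PerfMemory. All names are hypothetical;
// std::atomic replaces HotSpot's OrderAccess primitives.
namespace perf_sketch {

char*                    g_start    = nullptr;  // data published by initialize()
std::size_t              g_capacity = 0;
std::atomic<std::size_t> g_updates{0};
std::atomic<bool>        g_initialized{false};
std::atomic<bool>        g_destroyed{false};

// Acquire load: pairs with the release store in initialize(), so a thread
// that observes 'true' also observes g_start and g_capacity.
bool is_initialized() { return g_initialized.load(std::memory_order_acquire); }

// No data is published together with this flag, so a relaxed load is enough.
bool is_destroyed() { return g_destroyed.load(std::memory_order_relaxed); }

bool is_usable() { return is_initialized() && !is_destroyed(); }

void initialize(char* region, std::size_t capacity) {
  if (is_initialized()) {
    return;  // initialization already performed
  }
  g_start = region;
  g_capacity = capacity;
  // Release store: the field writes above become visible before the flag.
  g_initialized.store(true, std::memory_order_release);
}

// Logical destroy: the fields are left intact so that a post-mortem tool
// can still read them from a core image; only the flag changes.
void destroy() {
  if (!is_initialized()) {
    return;
  }
  g_destroyed.store(true, std::memory_order_relaxed);
}

// Guarded mutator: refuses to act before init or after destroy.
bool mark_updated() {
  if (!is_usable()) {
    return false;
  }
  g_updates.fetch_add(1, std::memory_order_relaxed);
  return true;
}

// Guarded accessor using the assert wording proposed in the review.
std::size_t capacity() {
  assert(is_usable() && "called before init or after destroy");
  return g_capacity;
}

}  // namespace perf_sketch
```

In this sketch, calling initialize() on a buffer makes is_usable() true; after destroy(), mark_updated() returns false while g_start and g_capacity keep their values, which is the property a core-reading tool would rely on.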
>>> >>> 213 assert(is_useable(), "called before initialization"); >>> >>> -> "called before init or after destroy" >>> >>> Could add a similar assert in PerfMemory::mark_updated(). >>> >>> Let's see what Serguei thinks. :) >>> >>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>> >>>>>> Hi David, >>>>>> >>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes: >>>>>>> >>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>> >>>>>>>> >>>>>>>> Hi David, >>>>>>>> >>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>> would >>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call munmap() >>>>>>>> for PerfMemory. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Perhaps not but there are still other actions that happen and the point >>>>>>> is >>>>>>> we should not be able to continue to use PerfMemory once it has been >>>>>>> destroyed (even if the destruction is only logical). >>>>>> >>>>>> >>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>> decide how should we do. >>>>>> >>>>>> >>>>>> >>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>> >>>>>> In that discussion, I uploaded another webrev which adds other fields >>>>>> for >>>>>> JSnap. >>>>>> Is it suitable? >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>> >>>>> >>>>> I don't think we need the extra fields, just ensure the existing ones >>>>> can't >>>>> be accessed (other than by the tools) after destroy is called. >>>>> >>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>> initialization? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. 
>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I'm not familiar with these tools. When do we produce a core file after >>>>>>> calling PerfMemory::destroy ? >>>>>> >>>>>> >>>>>> PerfMemory::destroy() is called before aborting. >>>>> >>>>> >>>>> Ah - right. I assume we need to close off the perfdata file before we >>>>> abort. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>>> ----------------------- >>>>>> #0 perfMemory_exit () >>>>>> at >>>>>> >>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>> #1 0x00007f99b091c949 in os::shutdown () >>>>>> at >>>>>> >>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>> #2 0x00007f99b091c980 in os::abort (dump_core=) >>>>>> at >>>>>> >>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>> #3 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>> this=this at entry=0x7ffcacf40b50) >>>>>> at >>>>>> >>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>> #4 0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig at entry=11, >>>>>> info=info at entry=0x7ffcacf40df0, >>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>> at >>>>>> >>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>> ----------------------- >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>>>>> But it seems to me that there are various checks of >>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>> is_destroyed() as a guard. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Should I change all assertions for _prologue? 
>>>>>>> >>>>>>> >>>>>>> >>>>>>> Assertions and direct guards. Checking _prologue is a placeholder for >>>>>>> the >>>>>>> real check. >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes: >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>> >>>>>>>>> We hit the assertion: >>>>>>>>> >>>>>>>>> # Internal Error >>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>> pid=17874, tid=17875 >>>>>>>>> # assert(_prologue != __null) failed: called before initialization >>>>>>>>> # >>>>>>>>> >>>>>>>>> which is misleading because it can fail if called before >>>>>>>>> initialization, >>>>>>>>> or >>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>> >>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>> would >>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>> >>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>> initialization? But it seems to me that there are various checks of >>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>> is_destroyed() as a guard. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> PING: >>>>>>>>>> >>>>>>>>>> Could you review it? 
>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>> >>>>>>>>>>> Could you review it? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hi all, >>>>>>>>>>>> >>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> PING: >>>>>>>>>>>>> >>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> PING: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse core image >>>>>>>>>>>>>>> with >>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>>>>>>>>>> review >>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not been reviewed >>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We've discussed about safety implementation, but we could not >>>>>>>>>>>>>>> get >>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes and core >>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think this patch is >>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double free, and >>>>>>>>>>>>>>> all >>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not munmap'ed) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>> > From david.holmes at oracle.com Wed Oct 18 09:51:09 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 19:51:09 +1000 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. 
In-Reply-To:
References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com>
Message-ID: <0c33eeff-6918-0ca9-8424-8f4a28378515@oracle.com>

On 18/10/2017 7:43 PM, Yasumasa Suenaga wrote:
> Hi Serguei,
>
> Should we use OrderAccess::load_acquire() to check _initialized and
> _destroyed?

Yes for _initialized.

> IMHO it do not need because initialize / destroy of PerfMemory seem not
> to be on multi-threaded.

Initialization would normally be single-threaded, but I suspect there may be (or were in the past) possible ways to have it occur in different threads - else we'd not need the release_store.

Destroy can be attempted by multiple threads I believe, if there are multiple aborts for example.

Thanks,
David

>
> Thanks,
>
> Yasumasa.
>
>
> 2017/10/18 6:25 PM "serguei.spitsyn at oracle.com":
>
> Hi Yasumasa,
>
> Sorry for a quite late participation.
>
> I looked at the previous webrevs and think that this one is much better.
>
> Some concern is if we need any kind of synchronization here, e.g. CAS.
> But it depends on the PerfMemory class usage.
>
> Should we make the static variables '_initialized' and '_destroyed'
> volatile?
>
> Also, the '_initialized' is set to 1 with:
>    159    OrderAccess::release_store(&_initialized, 1);
>
> Should we do the same to set the '_destroyed'?:
>    200    _destroyed = true;
>
>
> Thanks,
> Serguei
>
>
> On 10/18/17 00:39, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> Thank you for your comment.
>> I uploaded new webrev: >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >> >> >> Serguei, please comment about this :-) >> >> >> Yasumasa >> >> >> >> 2017-10-18 16:09 GMT+09:00 David Holmes : >>> Hi Yasumasa, >>> >>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>> Hi David, >>>> >>>>> I don't think we need the extra fields, just ensure the existing ones >>>>> can't >>>>> be accessed (other than by the tools) after destroy is called. >>>> >>>> I've added PerfMemory::is_useable() to check whether we can access to >>>> PerfMemory. >>>> I think this webrev prevent to access to PerfMemory after destroy() call. >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>> >>> >>> This: >>> >>> 90 void PerfMemory::initialize() { >>> 91 >>> 92 if (_prologue != NULL) >>> 93 // initialization already performed >>> 94 return; >>> >>> shouldn't check _prologue, but is_initialized(). >>> >>> 213 assert(is_useable(), "called before initialization"); >>> >>> -> "called before init or after destroy" >>> >>> Could add a similar assert in PerfMemory::mark_updated(). >>> >>> Let's see what Serguei thinks. :) >>> >>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> 2017-10-18 13:44 GMT+09:00 David Holmes : >>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>> >>>>>> Hi David, >>>>>> >>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes : >>>>>>> >>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>> >>>>>>>> >>>>>>>> Hi David, >>>>>>>> >>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>> would >>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call munmap() >>>>>>>> for PerfMemory. 
>>>>>>> >>>>>>> >>>>>>> >>>>>>> Perhaps not but there are still other actions that happen and the point >>>>>>> is >>>>>>> we should not be able to continue to use PerfMemory once it has been >>>>>>> destroyed (even if the destruction is only logical). >>>>>> >>>>>> >>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>> decide how should we do. >>>>>> >>>>>> >>>>>> >>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>> >>>>>> >>>>>> In that discussion, I uploaded another webrev which adds other fields >>>>>> for >>>>>> JSnap. >>>>>> Is it suitable? >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>> >>>>> >>>>> >>>>> I don't think we need the extra fields, just ensure the existing ones >>>>> can't >>>>> be accessed (other than by the tools) after destroy is called. >>>>> >>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>> initialization? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I'm not familiar with these tools. When do we produce a core file after >>>>>>> calling PerfMemory::destroy ? >>>>>> >>>>>> >>>>>> PerfMemory::destroy() is called before aborting. >>>>> >>>>> >>>>> Ah - right. I assume we need to close off the perfdata file before we >>>>> abort. 
>>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>>> ----------------------- >>>>>> #0 perfMemory_exit () >>>>>> at >>>>>> >>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>> #1 0x00007f99b091c949 in os::shutdown () >>>>>> at >>>>>> >>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>> #2 0x00007f99b091c980 in os::abort (dump_core=) >>>>>> at >>>>>> >>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>> #3 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>> this=this at entry=0x7ffcacf40b50) >>>>>> at >>>>>> >>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>> #4 0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig at entry=11, >>>>>> info=info at entry=0x7ffcacf40df0, >>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>> at >>>>>> >>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>> ----------------------- >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>>>>> But it seems to me that there are various checks of >>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>> is_destroyed() as a guard. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Should I change all assertions for _prologue? >>>>>>> >>>>>>> >>>>>>> >>>>>>> Assertions and direct guards. Checking _prologue is a placeholder for >>>>>>> the >>>>>>> real check. 
>>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes : >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>> >>>>>>>>> >>>>>>>>> We hit the assertion: >>>>>>>>> >>>>>>>>> # Internal Error >>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>> pid=17874, tid=17875 >>>>>>>>> # assert(_prologue != __null) failed: called before initialization >>>>>>>>> # >>>>>>>>> >>>>>>>>> which is misleading because it can fail if called before >>>>>>>>> initialization, >>>>>>>>> or >>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>> >>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>> would >>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>> >>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>> initialization? But it seems to me that there are various checks of >>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>> is_destroyed() as a guard. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> PING: >>>>>>>>>> >>>>>>>>>> Could you review it? 
>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Could you review it? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga : >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hi all, >>>>>>>>>>>> >>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> PING: >>>>>>>>>>>>> >>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> PING: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse core image >>>>>>>>>>>>>>> with >>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>>>>>>>>>> review >>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not been reviewed >>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We've discussed about safety implementation, but we could not >>>>>>>>>>>>>>> get >>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes and core >>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think this patch is >>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double free, and >>>>>>>>>>>>>>> all >>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not munmap'ed) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Can you cooperate? 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> > From serguei.spitsyn at oracle.com Wed Oct 18 10:26:37 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 18 Oct 2017 03:26:37 -0700 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> Message-ID: <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Wed Oct 18 12:28:00 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 22:28:00 +1000 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> References: <7b897b36-1824-606d-b206-df577a6afe02@gmail.com> <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> Message-ID: On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote: > Hi David, > > Thank you for jumping to this review and helping Yasumasa to sort it out! 
> I've just discovered that this issue was already on the table for
> several months without a significant progress.
>
>
> On 10/18/17 02:48, David Holmes wrote:
>> Hi Serguei
>>
>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Yasumasa,
>>>
>>> Sorry for a quite late participation.
>>>
>>> I looked at the previous webrevs and think that this one is much better.
>>>
>>> Some concern is if we need any kind of synchronization here, e.g. CAS.
>>> But it depends on the PerfMemory class usage.
>>>
>>> Should we make the static variables '_initialized' and '_destroyed'
>>> volatile?
>>
>> For good measure - yes.
>>
>>> Also, the '_initialized' is set to 1 with:
>>>     159     OrderAccess::release_store(&_initialized, 1);
>>>
>>> Should we do the same to set the '_destroyed'?:
>>>     200     _destroyed = true;
>>
>> There is a benign initialization race but we need the release_store to
>> ensure all the data fields can be read if _initialized is seen as
>> true. But what is missing is a load_acquire() in is_initialized() to
>> ensure we synchronize with that store!
>
> Yes, I noticed that the load_acquire() is missed. :|
>
>>
>> There is also a potential for a destruction race (if multiple aborts
>> happens concurrently in different threads) but that also seems benign.
>> In this case there is no data being set so the store to _destroyed
>> does not need to be a release_store.
>
> I'm not convinced yet this is benign as the PerfMemory::destroy() has
> this call:
>     197     delete_memory_region();

Yes though most of its work ends up being no-ops.

>
> Now, I started thinking about the asserts that call the is_useable().
> Should they be returns instead?

I think this is a somewhat confused chunk of code. It's only fractionally
thread-safe yet once in use could be in use concurrently with an aborting
thread that calls destroy(). I don't think there is any simple fix for this.
If we're in the process of crashing does it really matter if we trigger a
secondary crash due to this?

The problems with this code go way beyond what Yasumasa is trying to
address with the JSnap problem and I would not want to put it back on him
to try and come up with an overall solution.

> Then the is_destroyed() would better to have the load_acquire().

You could add a load_acquire and do the store_release. It certainly would
not hurt, but I don't think it would actually benefit anything either.

Cheers,
David

> Just interested to know what do you think on this.
>
> Thanks,
> Serguei
>
>>
>> Cheers,
>> David
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 10/18/17 00:39, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>>
>>>> Thank you for your comment.
>>>> I uploaded new webrev:
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/
>>>>
>>>> Serguei, please comment about this :-)
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> 2017-10-18 16:09 GMT+09:00 David Holmes:
>>>>> Hi Yasumasa,
>>>>>
>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote:
>>>>>> Hi David,
>>>>>>
>>>>>>> I don't think we need the extra fields, just ensure the existing
>>>>>>> ones can't be accessed (other than by the tools) after destroy is
>>>>>>> called.
>>>>>>
>>>>>> I've added PerfMemory::is_useable() to check whether we can access to
>>>>>> PerfMemory.
>>>>>> I think this webrev prevent to access to PerfMemory after
>>>>>> destroy() call.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/
>>>>>
>>>>> This:
>>>>>
>>>>>    90 void PerfMemory::initialize() {
>>>>>    91
>>>>>    92   if (_prologue != NULL)
>>>>>    93     // initialization already performed
>>>>>    94     return;
>>>>>
>>>>> shouldn't check _prologue, but is_initialized().
>>>>>
>>>>>   213   assert(is_useable(), "called before initialization");
>>>>>
>>>>> -> "called before init or after destroy"
>>>>>
>>>>> Could add a similar assert in PerfMemory::mark_updated().
>>>>> >>>>> Let's see what Serguei thinks. :) >>>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>> >>>>>>>> Hi David, >>>>>>>> >>>>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes: >>>>>>>>> >>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>>> With your changes you no longer null out _prologue so the >>>>>>>>>>> assertion >>>>>>>>>>> would >>>>>>>>>>> now not fail and we'd proceed to access the deleted memory >>>>>>>>>>> region! >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call >>>>>>>>>> munmap() >>>>>>>>>> for PerfMemory. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Perhaps not but there are still other actions that happen and >>>>>>>>> the point >>>>>>>>> is >>>>>>>>> we should not be able to continue to use PerfMemory once it has >>>>>>>>> been >>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>> >>>>>>>> >>>>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>>>> decide how should we do. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>> >>>>>>>> >>>>>>>> In that discussion, I uploaded another webrev which adds other >>>>>>>> fields >>>>>>>> for >>>>>>>> JSnap. >>>>>>>> Is it suitable? >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>> >>>>>>> >>>>>>> I don't think we need the extra fields, just ensure the existing >>>>>>> ones >>>>>>> can't >>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>> >>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>> initialization? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. 
>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared.
>>>>>>>>>
>>>>>>>>> I'm not familiar with these tools. When do we produce a core
>>>>>>>>> file after calling PerfMemory::destroy ?
>>>>>>>>
>>>>>>>> PerfMemory::destroy() is called before aborting.
>>>>>>>
>>>>>>> Ah - right. I assume we need to close off the perfdata file
>>>>>>> before we abort.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> -----------------------
>>>>>>>> #0  perfMemory_exit ()
>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80
>>>>>>>> #1  0x00007f99b091c949 in os::shutdown ()
>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483
>>>>>>>> #2  0x00007f99b091c980 in os::abort (dump_core=)
>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503
>>>>>>>> #3  0x00007f99b0b689c3 in VMError::report_and_die (
>>>>>>>>     this=this@entry=0x7ffcacf40b50)
>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
>>>>>>>> #4  0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig@entry=11,
>>>>>>>>     info=info@entry=0x7ffcacf40df0,
>>>>>>>>     ucVoid=ucVoid@entry=0x7ffcacf40cc0,
>>>>>>>>     abort_if_unrecognized=abort_if_unrecognized@entry=1)
>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
>>>>>>>> -----------------------
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>>>> But it seems to me that there are various checks of
>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or
>>>>>>>>>>> is_destroyed() as a guard.
>>>>>>>>>>
>>>>>>>>>> Should I change all assertions for _prologue?
>>>>>>>>>
>>>>>>>>> Assertions and direct guards. Checking _prologue is a
>>>>>>>>> placeholder for the real check.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes:
>>>>>>>>>>>
>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>
>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday:
>>>>>>>>>>>
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390
>>>>>>>>>>>
>>>>>>>>>>> We hit the assertion:
>>>>>>>>>>>
>>>>>>>>>>> #  Internal Error
>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216),
>>>>>>>>>>> pid=17874, tid=17875
>>>>>>>>>>> #  assert(_prologue != __null) failed: called before
>>>>>>>>>>> initialization
>>>>>>>>>>> #
>>>>>>>>>>>
>>>>>>>>>>> which is misleading because it can fail if called before
>>>>>>>>>>> initialization, or after PerfMemory::destroy has been called.
>>>>>>>>>>>
>>>>>>>>>>> With your changes you no longer null out _prologue so the
>>>>>>>>>>> assertion would now not fail and we'd proceed to access the
>>>>>>>>>>> deleted memory region!
>>>>>>>>>>>
>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during
>>>>>>>>>>> initialization?
But it seems to me that there are various >>>>>>>>>>> checks of >>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> PING: >>>>>>>>>>>> >>>>>>>>>>>> Could you review it? >>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>> >>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa >>>>>>>>>>>>> Suenaga: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse >>>>>>>>>>>>>>>>> core image >>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and >>>>>>>>>>>>>>>>> I've sent >>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not been >>>>>>>>>>>>>>>>> reviewed >>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we >>>>>>>>>>>>>>>>> could not >>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes and >>>>>>>>>>>>>>>>> core >>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. 
I think this >>>>>>>>>>>>>>>>> patch is >>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double >>>>>>>>>>>>>>>>> free, and >>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not >>>>>>>>>>>>>>>>> munmap'ed) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>> > From david.holmes at oracle.com Wed Oct 18 12:34:12 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Oct 2017 22:34:12 +1000 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: References: <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> Message-ID: <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> Just to clarify ... On 18/10/2017 10:28 PM, David Holmes wrote: > On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote: >> Hi David, >> >> Thank you for jumping to this review and helping Yasumasa to sort it out! >> I've just discovered that this issue was already on the table for >> several months without a significant progress. 
>> >> >> On 10/18/17 02:48, David Holmes wrote: >>> Hi Serguei >>> >>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote: >>>> Hi Yasumasa, >>>> >>>> Sorry for a quite late participation. >>>> >>>> I looked at the previous webrevs and think that this one is much >>>> better. >>>> >>>> Some concern is if we need any kind of synchronization here, e.g. CAS. >>>> But it depends on the PerfMemory class usage. >>>> >>>> Should we make the static variables '_initialized' and '_destroyed' >>>> volatile? >>> >>> For good measure - yes. >>> >>>> Also, the '_initialized' is set to 1 with: >>>> ??? 159??? OrderAccess::release_store(&_initialized, 1); >>>> >>>> Should we do the same to set the '_destroyed'?: >>>> 200 _destroyed = true; >>> >>> There is a benign initialization race but we need the release_store >>> to ensure all the data fields can be read if _initialized is seen as >>> true. But what is missing is a load_acquire() in is_initialized() to >>> ensure we synchronize with that store! >> >> Yes, I noticed that the load_acquire() is missed. :| >> >>> >>> There is also a potential for a destruction race (if multiple aborts >>> happens concurrently in different threads) but that also seems >>> benign. In this case there is no data being set so the store to >>> _destroyed does not need to be a release_store. >> >> I'm not convinced yet this is benign as the PerfMemory::destroy() has >> this call: >> ?? 197 delete_memory_region(); > > Yes though most of its work ends up being no-ops. > >> >> Now, I started thinking about the asserts that call the is_useable(). >> Should they be returns instead? > > I think this is a somewhat confused chunk of code. It's only > fractionally thread-safe yet once in use could be in use concurrently > with an aborting thread that calls destroy(). I don't think there is any > simple fix for this. If we're in the process of crashing does it really > matter if we trigger a secondary crash due to this? 
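The flag handling discussed in the quoted review above can be sketched in portable C++. This is only a minimal illustration under stated assumptions, not the webrev code: PerfMemorySketch, _capacity and capacity() are hypothetical names, and std::atomic stands in for HotSpot's OrderAccess calls. It shows the release store on _initialized paired with an acquire load in is_initialized(), and guards that test the flags rather than _prologue.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for PerfMemory. Names mirror the thread, but this
// is not HotSpot code; std::atomic replaces OrderAccess for portability.
class PerfMemorySketch {
 public:
  // Writer side: set the plain data field first, then publish the flag
  // with a release store so the field write cannot be reordered after it.
  static void initialize() {
    if (is_initialized()) return;  // guard on the flag, not on a pointer
    _capacity = 4096;              // assumed example value
    _initialized.store(1, std::memory_order_release);
  }

  // Reader side: the acquire load pairs with the release store above, so a
  // thread that observes 1 here also observes the data field written
  // before the flag was published.
  static bool is_initialized() {
    return _initialized.load(std::memory_order_acquire) != 0;
  }

  // Logical destruction only: the data fields are left in place so that a
  // tool reading a core image can still find them.
  static void destroy() {
    if (!is_useable()) return;
    _destroyed.store(true, std::memory_order_release);
  }

  static bool is_destroyed() {
    return _destroyed.load(std::memory_order_acquire);
  }

  static bool is_useable() { return is_initialized() && !is_destroyed(); }

  static int capacity() {
    assert(is_initialized() && "called before init");
    return _capacity;
  }

 private:
  static int _capacity;
  static std::atomic<int> _initialized;
  static std::atomic<bool> _destroyed;
};

int PerfMemorySketch::_capacity = 0;
std::atomic<int> PerfMemorySketch::_initialized{0};
std::atomic<bool> PerfMemorySketch::_destroyed{false};
```

Even with these orderings, a caller's is_useable() result can go stale if another thread calls destroy() immediately afterwards, so the sketch addresses visibility of the flags only and not the wider thread-safety concern raised in the quoted text.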
It doesn't matter if we do:

assert(is_useable(),...);
// continue

or

if (!is_useable()) return;
// continue

because as soon as we have checked is_useable() an abort happening in
another thread may have changed that by calling destroy. This code is
basically broken if we hit an abort path instead of a normal VM shutdown.

David
-----

> The problems with this code go way beyond what Yasumasa is trying to
> address with the JSnap problem and I would not want to put it back on
> him to try and come up with an overall solution.
>
>> Then the is_destroyed() would better to have the load_acquire().
>
> You could add a load_acquire and do the store_release. It certainly
> would not hurt, but I don't think it would actually benefit anything
> either.
>
> Cheers,
> David
>
>> Just interested to know what do you think on this.
>>
>> Thanks,
>> Serguei
>>
>>>
>>> Cheers,
>>> David
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote:
>>>>> Hi David,
>>>>>
>>>>> Thank you for your comment.
>>>>> I uploaded new webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/
>>>>>
>>>>> Serguei, please comment about this :-)
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> 2017-10-18 16:09 GMT+09:00 David Holmes:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>>> I don't think we need the extra fields, just ensure the existing
>>>>>>>> ones can't be accessed (other than by the tools) after destroy is
>>>>>>>> called.
>>>>>>>
>>>>>>> I've added PerfMemory::is_useable() to check whether we can
>>>>>>> access to PerfMemory.
>>>>>>> I think this webrev prevent to access to PerfMemory after
>>>>>>> destroy() call.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/
>>>>>>
>>>>>> This:
>>>>>>
>>>>>>    90 void PerfMemory::initialize() {
>>>>>>    91
>>>>>>    92   if (_prologue != NULL)
>>>>>>    93
// initialization already performed >>>>>> ?? 94???? return; >>>>>> >>>>>> shouldn't check _prologue, but is_initialized(). >>>>>> >>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>> >>>>>> -> "called before init or after destroy" >>>>>> >>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>> >>>>>> Let's see what Serguei thinks. :) >>>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>> >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes: >>>>>>>>>> >>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>>> With your changes you no longer null out _prologue so the >>>>>>>>>>>> assertion >>>>>>>>>>>> would >>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory >>>>>>>>>>>> region! >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call >>>>>>>>>>> munmap() >>>>>>>>>>> for PerfMemory. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Perhaps not but there are still other actions that happen and >>>>>>>>>> the point >>>>>>>>>> is >>>>>>>>>> we should not be able to continue to use PerfMemory once it >>>>>>>>>> has been >>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>> >>>>>>>>> >>>>>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>>>>> decide how should we do. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>> >>>>>>>>> >>>>>>>>> In that discussion, I uploaded another webrev which adds other >>>>>>>>> fields >>>>>>>>> for >>>>>>>>> JSnap. >>>>>>>>> Is it suitable? 
>>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>> >>>>>>>> >>>>>>>> I don't think we need the extra fields, just ensure the existing >>>>>>>> ones >>>>>>>> can't >>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>> >>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>> initialization? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I'm not familiar with these tools. When do we produce a core >>>>>>>>>> file after >>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>> >>>>>>>>> >>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>> >>>>>>>> >>>>>>>> Ah - right. I assume we need to close off the perfdata file >>>>>>>> before we >>>>>>>> abort. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> >>>>>>>>> ----------------------- >>>>>>>>> #0? perfMemory_exit () >>>>>>>>> ?????? at >>>>>>>>> >>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>> >>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>> ?????? at >>>>>>>>> >>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>> >>>>>>>>> #2? 0x00007f99b091c980 in os::abort (dump_core=) >>>>>>>>> ?????? at >>>>>>>>> >>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>> >>>>>>>>> #3? 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>> ?????? this=this at entry=0x7ffcacf40b50) >>>>>>>>> ?????? at >>>>>>>>> >>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>> >>>>>>>>> #4? 
0x00007f99b0926f04 in JVM_handle_linux_signal >>>>>>>>> (sig=sig at entry=11, >>>>>>>>> ?????? info=info at entry=0x7ffcacf40df0, >>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>> ?????? at >>>>>>>>> >>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>> >>>>>>>>> ----------------------- >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>> _prologue that should really be checking is_initialized() >>>>>>>>>>>> and/or >>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Assertions and direct guards. Checking _prologue is a >>>>>>>>>> placeholder for >>>>>>>>>> the >>>>>>>>>> real check. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David >>>>>>>>>>> Holmes: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>> >>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>>>>> >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>> >>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>> >>>>>>>>>>>> #? Internal Error >>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>> #? assert(_prologue != __null) failed: called before >>>>>>>>>>>> initialization >>>>>>>>>>>> # >>>>>>>>>>>> >>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>> initialization, >>>>>>>>>>>> or >>>>>>>>>>>> after PerfMemory::destroy has been called. 
>>>>>>>>>>>> >>>>>>>>>>>> With your changes you no longer null out _prologue so the >>>>>>>>>>>> assertion >>>>>>>>>>>> would >>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory >>>>>>>>>>>> region! >>>>>>>>>>>> >>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>> initialization? But it seems to me that there are various >>>>>>>>>>>> checks of >>>>>>>>>>>> _prologue that should really be checking is_initialized() >>>>>>>>>>>> and/or >>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> PING: >>>>>>>>>>>>> >>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa >>>>>>>>>>>>>> Suenaga: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse >>>>>>>>>>>>>>>>>> core image >>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and >>>>>>>>>>>>>>>>>> I've sent >>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>> request for it as JDK-8151815. 
However it has not been >>>>>>>>>>>>>>>>>> reviewed >>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we >>>>>>>>>>>>>>>>>> could not >>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes and >>>>>>>>>>>>>>>>>> core >>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think this >>>>>>>>>>>>>>>>>> patch is >>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double >>>>>>>>>>>>>>>>>> free, and >>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not >>>>>>>>>>>>>>>>>> munmap'ed) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>> >> From yasuenag at gmail.com Wed Oct 18 13:51:39 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 18 Oct 2017 22:51:39 +0900 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. 
In-Reply-To: <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com>
References: <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com>
 <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com>
 <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com>
 <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com>
 <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com>
 <41185116-0271-374c-da9e-74e173d6b58c@oracle.com>
 <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com>
 <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com>
Message-ID: <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com>

Hi David, Serguei,

> because as soon as we have checked is_usable() an abort happening in another thread may have changed that by calling destroy.
>
> This code is basically broken if we hit an abort path instead of a normal VM shutdown.

Can we use MutexLocker for initialize() and destroy()?

I've tried to address your comments, but I have an issue with volatile.
PerfMemory.java depends on PerfMemory::_initialized. However, VMStructs cannot handle static volatile variables.

I can think of two approaches:

  1. Remove the _initialized check from PerfMemory.java.
     SA will throw UnmappedAddressException if JSnap tries to access an invalid address, including uninitialized memory.

  2. Add static volatile support to VMStructs.

Which should we do?
1. is easy to fix, but 2. might be the right way...

Thanks,

Yasumasa

On 2017/10/18 21:34, David Holmes wrote:
> Just to clarify ...
>
> On 18/10/2017 10:28 PM, David Holmes wrote:
>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi David,
>>>
>>> Thank you for jumping to this review and helping Yasumasa to sort it out!
>>> I've just discovered that this issue was already on the table for several months without a significant progress.
>>>
>>>
>>> On 10/18/17 02:48, David Holmes wrote:
>>>> Hi Serguei
>>>>
>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> Sorry for a quite late participation.
>>>>> >>>>> I looked at the previous webrevs and think that this one is much better. >>>>> >>>>> Some concern is if we need any kind of synchronization here, e.g. CAS. >>>>> But it depends on the PerfMemory class usage. >>>>> >>>>> Should we make the static variables '_initialized' and '_destroyed' volatile? >>>> >>>> For good measure - yes. >>>> >>>>> Also, the '_initialized' is set to 1 with: >>>>> ??? 159??? OrderAccess::release_store(&_initialized, 1); >>>>> >>>>> Should we do the same to set the '_destroyed'?: >>>>> 200 _destroyed = true; >>>> >>>> There is a benign initialization race but we need the release_store to ensure all the data fields can be read if _initialized is seen as true. But what is missing is a load_acquire() in is_initialized() to ensure we synchronize with that store! >>> >>> Yes, I noticed that the load_acquire() is missed. :| >>> >>>> >>>> There is also a potential for a destruction race (if multiple aborts happens concurrently in different threads) but that also seems benign. In this case there is no data being set so the store to _destroyed does not need to be a release_store. >>> >>> I'm not convinced yet this is benign as the PerfMemory::destroy() has this call: >>> ?? 197 delete_memory_region(); >> >> Yes though most of its work ends up being no-ops. >> >>> >>> Now, I started thinking about the asserts that call the is_useable(). >>> Should they be returns instead? >> >> I think this is a somewhat confused chunk of code. It's only fractionally thread-safe yet once in use could be in use concurrently with an aborting thread that calls destroy(). I don't think there is any simple fix for this. If we're in the process of crashing does it really matter if we trigger a secondary crash due to this? 
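The release_store / load_acquire pairing discussed above can be sketched in standalone C++. This is only an illustration under stated assumptions: it uses std::atomic rather than HotSpot's OrderAccess, and the names PerfState, capacity_if_usable and the single capacity field are hypothetical stand-ins, not the PerfMemory code from the webrev.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the state discussed in the thread (not HotSpot code).
struct PerfState {
    int capacity = 0;                    // plain data written during initialization
    std::atomic<int> initialized{0};     // publication flag, like _initialized
    std::atomic<bool> destroyed{false};  // like _destroyed

    void initialize(int cap) {
        capacity = cap;                                   // 1. write the data fields
        initialized.store(1, std::memory_order_release);  // 2. publish them
    }

    // The acquire load pairs with the release store above: a thread that
    // sees initialized != 0 also sees the data written before the store.
    bool is_initialized() const {
        return initialized.load(std::memory_order_acquire) != 0;
    }

    // No data is published by destroy(), so release/acquire here is
    // harmless but not strictly required.
    void destroy() { destroyed.store(true, std::memory_order_release); }

    bool is_destroyed() const {
        return destroyed.load(std::memory_order_acquire);
    }

    bool is_usable() const { return is_initialized() && !is_destroyed(); }
};

// Returns the capacity if the state is usable at the time of the check,
// or -1 otherwise.
int capacity_if_usable(const PerfState& s) {
    return s.is_usable() ? s.capacity : -1;
}
```

Even with acquire/release on both flags, a caller that checks is_usable() and then touches the region can still race with a concurrent destroy(); the ordering only makes the flag reads themselves reliable.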
> > It doesn't matter if we do: > > assert(is_usable(),...); > // continue > > or > > if (!is_usable()) return; > // continue > > because as soon as we have checked is_usable() and abort happening in another thread may have changed that by calling destroy. > > This code is basically broken if we hit an abort path instead of a normal VM shutdown. > > David > ----- > >> The problems with this code go way beyond what Yasumasa is trying to address with the JSnap problem and I would not want to put it back on him to try and come up with an overall solution. >> >>> Then the is_destroyed() would better to have the load_acquire(). >> >> You could add a load_acquire and do the store_release. It certainly would not hurt, but I don't think it would actually benefit anything either. >> >> Cheers, >> David >> >>> Just interested to know what do you think on this. >>> >>> Thanks, >>> Serguei >>> >>>> >>>> Cheers, >>>> David >>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>> Hi David, >>>>>> >>>>>> Thank you for your comment. >>>>>> I uploaded new webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>>> >>>>>> Serguei, please comment about this :-) >>>>>> >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> >>>>>> 2017-10-18 16:09 GMT+09:00 David Holmes: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>> can't >>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>> >>>>>>>> I've added PerfMemory::is_useable() to check whether we can access to >>>>>>>> PerfMemory. >>>>>>>> I think this webrev prevent to access to PerfMemory after destroy() call. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>> >>>>>>> This: >>>>>>> >>>>>>> ?? 90 void PerfMemory::initialize() { >>>>>>> ?? 91 >>>>>>> ?? 92?? 
if (_prologue != NULL) >>>>>>> ?? 93???? // initialization already performed >>>>>>> ?? 94???? return; >>>>>>> >>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>> >>>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>>> >>>>>>> -> "called before init or after destroy" >>>>>>> >>>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>>> >>>>>>> Let's see what Serguei thinks. :) >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>> >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes: >>>>>>>>>>> >>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>> would >>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call munmap() >>>>>>>>>>>> for PerfMemory. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Perhaps not but there are still other actions that happen and the point >>>>>>>>>>> is >>>>>>>>>>> we should not be able to continue to use PerfMemory once it has been >>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>>>>>> decide how should we do. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>> >>>>>>>>>> In that discussion, I uploaded another webrev which adds other fields >>>>>>>>>> for >>>>>>>>>> JSnap. >>>>>>>>>> Is it suitable? 
>>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>> >>>>>>>>> >>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>> can't >>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>> >>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>> initialization? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'm not familiar with these tools. When do we produce a core file after >>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>> >>>>>>>>> >>>>>>>>> Ah - right. I assume we need to close off the perfdata file before we >>>>>>>>> abort. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>>>>> ----------------------- >>>>>>>>>> #0? perfMemory_exit () >>>>>>>>>> ?????? at >>>>>>>>>> >>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>>> ?????? at >>>>>>>>>> >>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>> #2? 0x00007f99b091c980 in os::abort (dump_core=) >>>>>>>>>> ?????? at >>>>>>>>>> >>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>> #3? 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>> ?????? this=this at entry=0x7ffcacf40b50) >>>>>>>>>> ?????? at >>>>>>>>>> >>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>> #4? 
0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig at entry=11, >>>>>>>>>> ?????? info=info at entry=0x7ffcacf40df0, >>>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>>> ?????? at >>>>>>>>>> >>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>> ----------------------- >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Assertions and direct guards. Checking _prologue is a placeholder for >>>>>>>>>>> the >>>>>>>>>>> real check. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>>>>>> >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>> >>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>> >>>>>>>>>>>>> #? Internal Error >>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>> #? assert(_prologue != __null) failed: called before initialization >>>>>>>>>>>>> # >>>>>>>>>>>>> >>>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>>> initialization, >>>>>>>>>>>>> or >>>>>>>>>>>>> after PerfMemory::destroy has been called. 
>>>>>>>>>>>>> >>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>> would >>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>>> >>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>> initialization? But it seems to me that there are various checks of >>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> PING: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse core image >>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. 
However it has not been reviewed >>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we could not >>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes and core >>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think this patch is >>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double free, and >>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not munmap'ed) >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>> >>>>> >>> From serguei.spitsyn at oracle.com Wed Oct 18 18:15:02 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 18 Oct 2017 11:15:02 -0700 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. 
In-Reply-To:
References: <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com>
 <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com>
 <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com>
 <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com>
 <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com>
 <41185116-0271-374c-da9e-74e173d6b58c@oracle.com>
 <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com>
Message-ID:

On 10/18/17 05:28, David Holmes wrote:
> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote:
>> Hi David,
>>
>> Thank you for jumping to this review and helping Yasumasa to sort it
>> out!
>> I've just discovered that this issue was already on the table for
>> several months without a significant progress.
>>
>>
>> On 10/18/17 02:48, David Holmes wrote:
>>> Hi Serguei
>>>
>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Sorry for a quite late participation.
>>>>
>>>> I looked at the previous webrevs and think that this one is much
>>>> better.
>>>>
>>>> Some concern is if we need any kind of synchronization here, e.g. CAS.
>>>> But it depends on the PerfMemory class usage.
>>>>
>>>> Should we make the static variables '_initialized' and '_destroyed'
>>>> volatile?
>>>
>>> For good measure - yes.
>>>
>>>> Also, the '_initialized' is set to 1 with:
>>>>     159     OrderAccess::release_store(&_initialized, 1);
>>>>
>>>> Should we do the same to set the '_destroyed'?:
>>>>     200 _destroyed = true;
>>>
>>> There is a benign initialization race but we need the release_store
>>> to ensure all the data fields can be read if _initialized is seen as
>>> true. But what is missing is a load_acquire() in is_initialized() to
>>> ensure we synchronize with that store!
>>
>> Yes, I noticed that the load_acquire() is missed. :|
>>
>>>
>>> There is also a potential for a destruction race (if multiple aborts
>>> happens concurrently in different threads) but that also seems
>>> benign.
In this case there is no data being set so the store to
>>> _destroyed does not need to be a release_store.
>>
>> I'm not convinced yet this is benign as the PerfMemory::destroy() has
>> this call:
>>    197 delete_memory_region();
>
> Yes though most of its work ends up being no-ops.

This is the implementation of the PerfMemory::delete_memory_region() for Solaris:

    void PerfMemory::delete_memory_region() {

      assert((start() != NULL && capacity() > 0), "verify proper state");

      // If user specifies PerfDataSaveFile, it will save the performance data
      // to the specified file name no matter whether PerfDataSaveToFile is specified
      // or not. In other word, -XX:PerfDataSaveFile=.. overrides flag
      // -XX:+PerfDataSaveToFile.
      if (PerfDataSaveToFile || PerfDataSaveFile != NULL) {
        save_memory_to_file(start(), capacity());    <== This function is non-trivial
      }

      if (PerfDisableSharedMem) {
        delete_standard_memory(start(), capacity());
      }
      else {
        delete_shared_memory(start(), capacity());
      }
    }

>
>>
>> Now, I started thinking about the asserts that call the is_useable().
>> Should they be returns instead?
>
> I think this is a somewhat confused chunk of code. It's only
> fractionally thread-safe yet once in use could be in use concurrently
> with an aborting thread that calls destroy(). I don't think there is
> any simple fix for this. If we're in the process of crashing does it
> really matter if we trigger a secondary crash due to this?

Got it, thanks.

> The problems with this code go way beyond what Yasumasa is trying to
> address with the JSnap problem and I would not want to put it back on
> him to try and come up with an overall solution.

I was thinking about the same.

>> Then the is_destroyed() would better to have the load_acquire().
>
> You could add a load_acquire and do the store_release.
It certainly > would not hurt, but I don't think it would actually benefit anything > either. Ok with me. Thanks, Serguei > > Cheers, > David > >> Just interested to know what do you think on this. >> >> Thanks, >> Serguei >> >>> >>> Cheers, >>> David >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>> Hi David, >>>>> >>>>> Thank you for your comment. >>>>> I uploaded new webrev: >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>> >>>>> Serguei, please comment about this :-) >>>>> >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> >>>>> 2017-10-18 16:09 GMT+09:00 David Holmes: >>>>>> Hi Yasumasa, >>>>>> >>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>> Hi David, >>>>>>> >>>>>>>> I don't think we need the extra fields, just ensure the >>>>>>>> existing ones >>>>>>>> can't >>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>> >>>>>>> I've added PerfMemory::is_useable() to check whether we can >>>>>>> access to >>>>>>> PerfMemory. >>>>>>> I think this webrev prevent to access to PerfMemory after >>>>>>> destroy() call. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>> >>>>>> This: >>>>>> >>>>>> ?? 90 void PerfMemory::initialize() { >>>>>> ?? 91 >>>>>> ?? 92?? if (_prologue != NULL) >>>>>> ?? 93???? // initialization already performed >>>>>> ?? 94???? return; >>>>>> >>>>>> shouldn't check _prologue, but is_initialized(). >>>>>> >>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>> >>>>>> -> "called before init or after destroy" >>>>>> >>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>> >>>>>> Let's see what Serguei thinks. 
:) >>>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>> >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes: >>>>>>>>>> >>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>>> With your changes you no longer null out _prologue so the >>>>>>>>>>>> assertion >>>>>>>>>>>> would >>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory >>>>>>>>>>>> region! >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call >>>>>>>>>>> munmap() >>>>>>>>>>> for PerfMemory. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Perhaps not but there are still other actions that happen and >>>>>>>>>> the point >>>>>>>>>> is >>>>>>>>>> we should not be able to continue to use PerfMemory once it >>>>>>>>>> has been >>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>> >>>>>>>>> >>>>>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>>>>> decide how should we do. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>> >>>>>>>>> >>>>>>>>> In that discussion, I uploaded another webrev which adds other >>>>>>>>> fields >>>>>>>>> for >>>>>>>>> JSnap. >>>>>>>>> Is it suitable? >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>> >>>>>>>> >>>>>>>> I don't think we need the extra fields, just ensure the >>>>>>>> existing ones >>>>>>>> can't >>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>> >>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>> initialization? 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I'm not familiar with these tools. When do we produce a core >>>>>>>>>> file after >>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>> >>>>>>>>> >>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>> >>>>>>>> >>>>>>>> Ah - right. I assume we need to close off the perfdata file >>>>>>>> before we >>>>>>>> abort. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> >>>>>>>>> ----------------------- >>>>>>>>> #0? perfMemory_exit () >>>>>>>>> ?????? at >>>>>>>>> >>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>> >>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>> ?????? at >>>>>>>>> >>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>> >>>>>>>>> #2? 0x00007f99b091c980 in os::abort (dump_core=) >>>>>>>>> ?????? at >>>>>>>>> >>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>> >>>>>>>>> #3? 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>> ?????? this=this at entry=0x7ffcacf40b50) >>>>>>>>> ?????? at >>>>>>>>> >>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>> >>>>>>>>> #4? 0x00007f99b0926f04 in JVM_handle_linux_signal >>>>>>>>> (sig=sig at entry=11, >>>>>>>>> ?????? info=info at entry=0x7ffcacf40df0, >>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>> ?????? 
at >>>>>>>>> >>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>> >>>>>>>>> ----------------------- >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>> _prologue that should really be checking is_initialized() >>>>>>>>>>>> and/or >>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Assertions and direct guards. Checking _prologue is a >>>>>>>>>> placeholder for >>>>>>>>>> the >>>>>>>>>> real check. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David >>>>>>>>>>> Holmes: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>> >>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>>>>> >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>> >>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>> >>>>>>>>>>>> #? Internal Error >>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>> #? assert(_prologue != __null) failed: called before >>>>>>>>>>>> initialization >>>>>>>>>>>> # >>>>>>>>>>>> >>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>> initialization, >>>>>>>>>>>> or >>>>>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>>>>> >>>>>>>>>>>> With your changes you no longer null out _prologue so the >>>>>>>>>>>> assertion >>>>>>>>>>>> would >>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory >>>>>>>>>>>> region! >>>>>>>>>>>> >>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>> initialization? 
But it seems to me that there are various >>>>>>>>>>>> checks of >>>>>>>>>>>> _prologue that should really be checking is_initialized() >>>>>>>>>>>> and/or >>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> PING: >>>>>>>>>>>>> >>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa >>>>>>>>>>>>>> Suenaga: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse >>>>>>>>>>>>>>>>>> core image >>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and >>>>>>>>>>>>>>>>>> I've sent >>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not >>>>>>>>>>>>>>>>>> been reviewed >>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we >>>>>>>>>>>>>>>>>> could not >>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes >>>>>>>>>>>>>>>>>> and core >>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. 
I think this >>>>>>>>>>>>>>>>>> patch is >>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double >>>>>>>>>>>>>>>>>> free, and >>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not >>>>>>>>>>>>>>>>>> munmap'ed) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>> >> From serguei.spitsyn at oracle.com Wed Oct 18 18:17:30 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 18 Oct 2017 11:17:30 -0700 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> References: <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> Message-ID: <9e724aab-dbe9-e830-9c97-d7b6d2ca56a4@oracle.com> On 10/18/17 05:34, David Holmes wrote: > Just to clarify ... > > On 18/10/2017 10:28 PM, David Holmes wrote: >> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote: >>> Hi David, >>> >>> Thank you for jumping to this review and helping Yasumasa to sort it >>> out! 
>>> I've just discovered that this issue was already on the table for
>>> several months without a significant progress.
>>>
>>>
>>> On 10/18/17 02:48, David Holmes wrote:
>>>> Hi Serguei
>>>>
>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> Sorry for a quite late participation.
>>>>>
>>>>> I looked at the previous webrevs and think that this one is much
>>>>> better.
>>>>>
>>>>> Some concern is if we need any kind of synchronization here, e.g.
>>>>> CAS.
>>>>> But it depends on the PerfMemory class usage.
>>>>>
>>>>> Should we make the static variables '_initialized' and
>>>>> '_destroyed' volatile?
>>>>
>>>> For good measure - yes.
>>>>
>>>>> Also, the '_initialized' is set to 1 with:
>>>>>     159    OrderAccess::release_store(&_initialized, 1);
>>>>>
>>>>> Should we do the same to set the '_destroyed'?:
>>>>>     200    _destroyed = true;
>>>>
>>>> There is a benign initialization race but we need the release_store
>>>> to ensure all the data fields can be read if _initialized is seen
>>>> as true. But what is missing is a load_acquire() in
>>>> is_initialized() to ensure we synchronize with that store!
>>>
>>> Yes, I noticed that the load_acquire() is missed. :|
>>>
>>>>
>>>> There is also a potential for a destruction race (if multiple
>>>> aborts happens concurrently in different threads) but that also
>>>> seems benign. In this case there is no data being set so the store
>>>> to _destroyed does not need to be a release_store.
>>>
>>> I'm not convinced yet this is benign as the PerfMemory::destroy()
>>> has this call:
>>>     197    delete_memory_region();
>>
>> Yes though most of its work ends up being no-ops.
>>
>>>
>>> Now, I started thinking about the asserts that call the is_useable().
>>> Should they be returns instead?
>>
>> I think this is a somewhat confused chunk of code. It's only
>> fractionally thread-safe yet once in use could be in use concurrently
>> with an aborting thread that calls destroy().
>> I don't think there is any simple fix for this. If we're in the
>> process of crashing does it really matter if we trigger a secondary
>> crash due to this?
>
> It doesn't matter if we do:
>
> assert(is_usable(),...);
> // continue
>
> or
>
> if (!is_usable()) return;
> // continue
>
> because as soon as we have checked is_usable() and abort happening in
> another thread may have changed that by calling destroy.
>
> This code is basically broken if we hit an abort path instead of a
> normal VM shutdown.

Agreed.

Thanks,
Serguei

>
> David
> -----
>
>> The problems with this code go way beyond what Yasumasa is trying to
>> address with the JSnap problem and I would not want to put it back on
>> him to try and come up with an overall solution.
>>
>>> Then the is_destroyed() would better to have the load_acquire().
>>
>> You could add a load_acquire and do the store_release. It certainly
>> would not hurt, but I don't think it would actually benefit anything
>> either.
>>
>> Cheers,
>> David
>>
>>> Just interested to know what do you think on this.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> Thank you for your comment.
>>>>>> I uploaded new webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/
>>>>>>
>>>>>> Serguei, please comment about this :-)
>>>>>>
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2017-10-18 16:09 GMT+09:00 David Holmes:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>>> I don't think we need the extra fields, just ensure the
>>>>>>>>> existing ones
>>>>>>>>> can't
>>>>>>>>> be accessed (other than by the tools) after destroy is called.
>>>>>>>>
>>>>>>>> I've added PerfMemory::is_useable() to check whether we can
>>>>>>>> access to
>>>>>>>> PerfMemory.
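
As a rough illustration of the guards discussed in this thread (this is not the webrev itself), the sketch below models the lifecycle flags with std::atomic rather than HotSpot's OrderAccess. The class name PerfMemorySketch, the _data field, raw_data(), and the bodies of is_useable() and is_destroyed() are assumptions made up for the example. It pairs the release store in initialize() with an acquire load in is_initialized(), and it leaves the data in place after destroy() so that a post-mortem reader could still find it.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for PerfMemory's lifecycle state; not HotSpot code.
class PerfMemorySketch {
 public:
  static void initialize() {
    if (is_initialized()) return;  // guard on the flag, not on a data pointer
    _data = 42;                    // set up the data fields first ...
    _initialized.store(1, std::memory_order_release);  // ... then publish them
  }

  static void destroy() {
    if (!is_useable()) return;     // never initialized, or already destroyed
    // The data fields are deliberately not cleared here, so that a tool
    // reading a core image afterwards can still find them.
    _destroyed.store(1, std::memory_order_release);
  }

  static bool is_initialized() {
    // Acquire load pairs with the release store in initialize().
    return _initialized.load(std::memory_order_acquire) != 0;
  }

  static bool is_destroyed() {
    return _destroyed.load(std::memory_order_acquire) != 0;
  }

  // Assumed definition: usable means initialized and not yet destroyed.
  static bool is_useable() { return is_initialized() && !is_destroyed(); }

  // In-process accessor: only valid between initialize() and destroy().
  static int data() {
    assert(is_useable() && "called before init or after destroy");
    return _data;
  }

  // What an out-of-process, post-mortem reader would effectively see.
  static int raw_data() { return _data; }

 private:
  static std::atomic<int> _initialized;
  static std::atomic<int> _destroyed;
  static int _data;
};

std::atomic<int> PerfMemorySketch::_initialized{0};
std::atomic<int> PerfMemorySketch::_destroyed{0};
int PerfMemorySketch::_data = 0;
```

A caller can still pass the is_useable() check and then race with destroy() running on another thread, which is the limitation David describes above; the flags only make the intended state explicit, they do not make the class thread-safe.
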
>>>>>>>> I think this webrev prevent to access to PerfMemory after >>>>>>>> destroy() call. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>> >>>>>>> This: >>>>>>> >>>>>>> ?? 90 void PerfMemory::initialize() { >>>>>>> ?? 91 >>>>>>> ?? 92?? if (_prologue != NULL) >>>>>>> ?? 93???? // initialization already performed >>>>>>> ?? 94???? return; >>>>>>> >>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>> >>>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>>> >>>>>>> -> "called before init or after destroy" >>>>>>> >>>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>>> >>>>>>> Let's see what Serguei thinks. :) >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>> >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David >>>>>>>>>> Holmes: >>>>>>>>>>> >>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>>> With your changes you no longer null out _prologue so the >>>>>>>>>>>>> assertion >>>>>>>>>>>>> would >>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory >>>>>>>>>>>>> region! >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call >>>>>>>>>>>> munmap() >>>>>>>>>>>> for PerfMemory. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Perhaps not but there are still other actions that happen >>>>>>>>>>> and the point >>>>>>>>>>> is >>>>>>>>>>> we should not be able to continue to use PerfMemory once it >>>>>>>>>>> has been >>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>>>>>> decide how should we do. 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> In that discussion, I uploaded another webrev which adds >>>>>>>>>> other fields >>>>>>>>>> for >>>>>>>>>> JSnap. >>>>>>>>>> Is it suitable? >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>> >>>>>>>>> >>>>>>>>> I don't think we need the extra fields, just ensure the >>>>>>>>> existing ones >>>>>>>>> can't >>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>> >>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>> initialization? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'm not familiar with these tools. When do we produce a core >>>>>>>>>>> file after >>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>> >>>>>>>>> >>>>>>>>> Ah - right. I assume we need to close off the perfdata file >>>>>>>>> before we >>>>>>>>> abort. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>>>>> ----------------------- >>>>>>>>>> #0? perfMemory_exit () >>>>>>>>>> ?????? at >>>>>>>>>> >>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>> >>>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>>> ?????? at >>>>>>>>>> >>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>> >>>>>>>>>> #2? 0x00007f99b091c980 in os::abort (dump_core=) >>>>>>>>>> ?????? 
at >>>>>>>>>> >>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>> >>>>>>>>>> #3? 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>> ?????? this=this at entry=0x7ffcacf40b50) >>>>>>>>>> ?????? at >>>>>>>>>> >>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>> >>>>>>>>>> #4? 0x00007f99b0926f04 in JVM_handle_linux_signal >>>>>>>>>> (sig=sig at entry=11, >>>>>>>>>> ?????? info=info at entry=0x7ffcacf40df0, >>>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>>> ?????? at >>>>>>>>>> >>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>> >>>>>>>>>> ----------------------- >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>> _prologue that should really be checking is_initialized() >>>>>>>>>>>>> and/or >>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Assertions and direct guards. Checking _prologue is a >>>>>>>>>>> placeholder for >>>>>>>>>>> the >>>>>>>>>>> real check. 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David >>>>>>>>>>>> Holmes: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>>>>>> >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>> >>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>> >>>>>>>>>>>>> #? Internal Error >>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>> #? assert(_prologue != __null) failed: called before >>>>>>>>>>>>> initialization >>>>>>>>>>>>> # >>>>>>>>>>>>> >>>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>>> initialization, >>>>>>>>>>>>> or >>>>>>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>>>>>> >>>>>>>>>>>>> With your changes you no longer null out _prologue so the >>>>>>>>>>>>> assertion >>>>>>>>>>>>> would >>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory >>>>>>>>>>>>> region! >>>>>>>>>>>>> >>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>> initialization? But it seems to me that there are various >>>>>>>>>>>>> checks of >>>>>>>>>>>>> _prologue that should really be checking is_initialized() >>>>>>>>>>>>> and/or >>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> PING: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa >>>>>>>>>>>>>>> Suenaga: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse >>>>>>>>>>>>>>>>>>> core image >>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump >>>>>>>>>>>>>>>>>>> and I've sent >>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not >>>>>>>>>>>>>>>>>>> been reviewed >>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we >>>>>>>>>>>>>>>>>>> could not >>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes >>>>>>>>>>>>>>>>>>> and core >>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think this >>>>>>>>>>>>>>>>>>> patch is >>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards >>>>>>>>>>>>>>>>>>> double free, and >>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not >>>>>>>>>>>>>>>>>>> munmap'ed) >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Can you cooperate? 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>> >>> From rkennke at redhat.com Wed Oct 18 18:29:18 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 18 Oct 2017 20:29:18 +0200 Subject: RFR: 8189373: jmap -heap exited with error code Message-ID: <5f387be5-bd8a-91c6-39d4-871a74e80355@redhat.com> My recent CMSHeap extraction has broken the JVM servicability agent. It looks like I actually need a little bit more boilerplate to make it happy: http://cr.openjdk.java.net/~rkennke/8189373/webrev.00/ It does fix the test that's mentioned in the bug report: https://bugs.openjdk.java.net/browse/JDK-8189373 Is this the correct way to fix it? Roman From serguei.spitsyn at oracle.com Wed Oct 18 19:15:21 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 18 Oct 2017 12:15:21 -0700 Subject: RFR (XS): 8175510 Null pointer dereference in getModuleObject of JPLISAgent.c:790 In-Reply-To: References: Message-ID: <143a7b30-9c0c-6bbb-de84-404b0c150c61@oracle.com> Is anyone interested to review this simple fix? Otherwise, I'd suggest to push it with one review from David under trivial fix rule. Thanks, Serguei On 10/12/17 18:01, David Holmes wrote: > Hi Serguei, > > Seems quite reasonable. > > Reviewed. > > Thanks, > David > > On 13/10/2017 8:21 AM, serguei.spitsyn at oracle.com wrote: >> Please, review a fix for the Parfait bug: >> https://bugs.openjdk.java.net/browse/JDK-8175510 >> >> Webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8175510-jplis-parfait.1/ >> >> >> >> Summary: >> >> ?? This is the main fragment from the Parfait report: >> >> >> ??????? 
>>        getModuleObject
>>
>> #jdk/src/java.instrument/share/native/libinstrument/JPLISAgent.c
>> jdk-9+180-JDK9_linux
>>
>> 783.     int len = (last_slash == NULL) ? 0 : (int)(last_slash - cname);
>> 784.     char* pkg_name_buf = (char*)malloc(len + 1);
>> 785.
>> 786.     jplis_assert_msg(pkg_name_buf != NULL, "OOM error in native tmp buffer allocation");
>>          Pointer checked against constant 'NULL' but does not protect the dereference.
>> 787.     if (last_slash != NULL) {
>> 788.         strncpy(pkg_name_buf, cname, len);
>> 789.     }
>> 790.     pkg_name_buf[len] = '\0';
>>          *Null pointer dereference not protected by null check*
>>          Write to pointer pkg_name_buf that could be constant 'NULL'
>>
>>
>>    The malloc can return NULL in a case of OOME.
>>    The assert at L786 checks the returned pointer for NULL but does
>> not protect the dereference at L790.
>>    The fix is to replace the assert with printing a error message and
>> returning with NULL from the getModuleObject().
>>    It must be safe as the returned result is passed to the
>> sun.instrument.InstrumentationImpl.transform()
>>    which handles null passed as in the module parameter.
>>
>> Thanks,
>> Serguei
>>

From erik.gahlin at oracle.com Wed Oct 18 19:33:35 2017
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Wed, 18 Oct 2017 21:33:35 +0200
Subject: RFR: 8189440: Event tracing macros for allocation and weak oops processing
In-Reply-To:
References: <59E6BDAA.3080502@oracle.com>
Message-ID: <59E7AC8F.80101@oracle.com>

Hi David,

> Hi Erik,
>
> On 18/10/2017 12:34 PM, Erik Gahlin wrote:
>> Hi,
>>
>> Could I have a review of a change that adds two macros to be used
>> with event-based JVM tracing.
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8189440
>>
>> Webrev:
>> http://cr.openjdk.java.net/~egahlin/8189440_0
>
> Reviewed - though all somewhat mysterious in isolation :(
>

Thanks for the review! Sorry about the mysterious part.

> My only real query is in jniHandles.cpp:
>
> JvmtiExport::weak_oops_do(is_alive, f);
> + TRACE_WEAK_OOPS_DO(is_alive, f);
>
> Can't/shouldn't the tracing be done inside weak_oops_do?
>

Stefan Karlsson has a change out for review and if integrated before
this one, I will move the TRACE_WEAK_OOPS_DO into weak_oops_do in
weakProcessor.cpp.

Thanks
Erik

From chris.plummer at oracle.com Wed Oct 18 19:34:57 2017
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Oct 2017 12:34:57 -0700
Subject: RFR (XS): 8175510 Null pointer dereference in getModuleObject of JPLISAgent.c:790
In-Reply-To: <143a7b30-9c0c-6bbb-de84-404b0c150c61@oracle.com>
References: <143a7b30-9c0c-6bbb-de84-404b0c150c61@oracle.com>
Message-ID: <10ff109d-58df-3a14-ff89-ba79c91ec4bc@oracle.com>

I actually took a look at it the other day but never responded. I was
wondering if we really want to print a message here. I didn't see any
other cases of doing this. Also, if we are out of native memory, do we
really want to continue?

Chris

On 10/18/17 12:15 PM, serguei.spitsyn at oracle.com wrote:
> Is anyone interested to review this simple fix?
> Otherwise, I'd suggest to push it with one review from David under
> trivial fix rule.
>
> Thanks,
> Serguei
>
>
> On 10/12/17 18:01, David Holmes wrote:
>> Hi Serguei,
>>
>> Seems quite reasonable.
>>
>> Reviewed.
>>
>> Thanks,
>> David
>>
>> On 13/10/2017 8:21 AM, serguei.spitsyn at oracle.com wrote:
>>> Please, review a fix for the Parfait bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8175510
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8175510-jplis-parfait.1/
>>>
>>>
>>>
>>> Summary:
>>>
>>>    This is the main fragment from the Parfait report:
>>>
>>>
>>>        getModuleObject
>>>
>>> #jdk/src/java.instrument/share/native/libinstrument/JPLISAgent.c
>>> jdk-9+180-JDK9_linux
>>>
>>> 783.     int len = (last_slash == NULL) ? 0 : (int)(last_slash - cname);
>>> 784.     char* pkg_name_buf = (char*)malloc(len + 1);
>>> 785.
>>> 786.     jplis_assert_msg(pkg_name_buf != NULL, "OOM error in native tmp buffer allocation");
>>>          Pointer checked against constant 'NULL' but does not protect the dereference.
>>> 787.     if (last_slash != NULL) {
>>> 788.         strncpy(pkg_name_buf, cname, len);
>>> 789.     }
>>> 790.     pkg_name_buf[len] = '\0';
>>>          *Null pointer dereference not protected by null check*
>>>          Write to pointer pkg_name_buf that could be constant 'NULL'
>>>
>>>
>>>    The malloc can return NULL in a case of OOME.
>>>    The assert at L786 checks the returned pointer for NULL but does
>>> not protect the dereference at L790.
>>>    The fix is to replace the assert with printing a error message
>>> and returning with NULL from the getModuleObject().
>>>    It must be safe as the returned result is passed to the
>>> sun.instrument.InstrumentationImpl.transform()
>>>    which handles null passed as in the module parameter.
>>>
>>> Thanks,
>>> Serguei
>>>
>

From serguei.spitsyn at oracle.com Wed Oct 18 20:02:05 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Oct 2017 13:02:05 -0700
Subject: RFR (XS): 8175510 Null pointer dereference in getModuleObject of JPLISAgent.c:790
In-Reply-To: <10ff109d-58df-3a14-ff89-ba79c91ec4bc@oracle.com>
References: <143a7b30-9c0c-6bbb-de84-404b0c150c61@oracle.com> <10ff109d-58df-3a14-ff89-ba79c91ec4bc@oracle.com>
Message-ID: <4861b398-df3b-f3e0-9ff3-e9aae09eb98b@oracle.com>

Hi Chris,


On 10/18/17 12:34, Chris Plummer wrote:
> I actually took a look at it the other day but never responded.

Thank you for reviewing it!
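
For readers following the 8175510 summary quoted above, the shape of the proposed fix (check the allocation, report, and return NULL instead of asserting and then dereferencing) could be sketched roughly as follows. This is not the code from the webrev: the helper name package_name_of, the injectable alloc parameter, failing_alloc, and the use of fprintf for the message are all assumptions made so the failure path can be exercised in isolation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: copies the package part of a class name such as
// "java/lang/String" ("java/lang") into a freshly allocated buffer.
// Returns nullptr if the allocation fails; the caller frees the result.
// The alloc parameter exists only so the failure path can be tested.
char* package_name_of(const char* cname,
                      void* (*alloc)(std::size_t) = std::malloc) {
  assert(cname != nullptr);
  const char* last_slash = std::strrchr(cname, '/');
  int len = (last_slash == nullptr) ? 0 : (int)(last_slash - cname);
  char* pkg_name_buf = (char*)alloc(len + 1);

  // Real check instead of an assert: bail out before any dereference.
  if (pkg_name_buf == nullptr) {
    std::fprintf(stderr, "OOM error in native tmp buffer allocation\n");
    return nullptr;
  }
  if (last_slash != nullptr) {
    std::strncpy(pkg_name_buf, cname, len);
  }
  pkg_name_buf[len] = '\0';
  return pkg_name_buf;
}

// Hypothetical allocator that always fails, to simulate out-of-memory.
void* failing_alloc(std::size_t) { return nullptr; }
```

Whether returning NULL is acceptable depends on the caller tolerating it; the summary above states that InstrumentationImpl.transform() handles a null module, which this sketch simply takes as given.
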
> I was wondering if we really want to print a message here.I didn't see > any other cases of doing this. I see no harm to print it in this particular case at least to have some clue about what is going on. > Also, if we are out of native memory, do we really want to continue? In this case, the VM itself will be aborted with a big probability. I do not see cases where the JDWP agent is aborted other than at initialization. Thanks, Serguei > > Chris > > On 10/18/17 12:15 PM, serguei.spitsyn at oracle.com wrote: >> Is anyone interested to review this simple fix? >> Otherwise, I'd suggest to push it with one review from David under >> trivial fix rule. >> >> Thanks, >> Serguei >> >> >> On 10/12/17 18:01, David Holmes wrote: >>> Hi Serguei, >>> >>> Seems quite reasonable. >>> >>> Reviewed. >>> >>> Thanks, >>> David >>> >>> On 13/10/2017 8:21 AM, serguei.spitsyn at oracle.com wrote: >>>> Please, review a fix for the Parfait bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8175510 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8175510-jplis-parfait.1/ >>>> >>>> >>>> >>>> Summary: >>>> >>>> ?? This is the main fragment from the Parfait report: >>>> >>>> >>>> ??????? getModuleObject >>>> >>>> FileExpandCollapseLine >>>> #jdk/src/java.instrument/share/native/libinstrument/JPLISAgent.c >>>> jdk-9+180-JDK9_linux >>>> >>>> 783. >>>> >>>> ???? int len = (last_slash == NULL) ? 0 : (int)(last_slash - cname); >>>> >>>> 784. >>>> >>>> ???? char* pkg_name_buf = (char*)malloc(len + 1); >>>> >>>> 785. >>>> 786. >>>> >>>> ???? jplis_assert_msg(pkg_name_buf != NULL, "OOM error in native >>>> tmp buffer allocation"); >>>> >>>> Pointer checked against constant 'NULL' but does not protect the >>>> dereference. >>>> 787. >>>> >>>> ???? if (last_slash != NULL) { >>>> >>>> 788. >>>> >>>> ???????? strncpy(pkg_name_buf, cname, len); >>>> >>>> 789. >>>> >>>> ???? } >>>> >>>> 790. >>>> >>>> ???? 
>>>>     pkg_name_buf[len] = '\0';
>>>>
>>>> *Null pointer dereference not protected by null check*
>>>> Write to pointer pkg_name_buf that could be constant 'NULL'
>>>>
>>>>
>>>>    The malloc can return NULL in a case of OOME.
>>>>    The assert at L786 checks the returned pointer for NULL but does
>>>> not protect the dereference at L790.
>>>>    The fix is to replace the assert with printing an error message
>>>> and returning with NULL from the getModuleObject().
>>>>    It must be safe as the returned result is passed to the
>>>> sun.instrument.InstrumentationImpl.transform()
>>>>    which handles null passed as in the module parameter.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>
>
>

From erik.gahlin at oracle.com Wed Oct 18 20:04:54 2017
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Wed, 18 Oct 2017 22:04:54 +0200
Subject: RFR(S): 8189425: Minor updates in support of closed changes
In-Reply-To: <9acd0472-de6b-4f6f-454e-7a325492af15@oracle.com>
References: <59E6BB27.8020605@oracle.com>
 <9acd0472-de6b-4f6f-454e-7a325492af15@oracle.com>
Message-ID: <59E7B3E6.8000101@oracle.com>

Hi David,

> Hi Erik,
>
> On 18/10/2017 12:23 PM, Erik Gahlin wrote:
>> Hi,
>>
>> Could I have a review of this change that will adjust an assertion and
>
> Can you explain the adjustment please.

We have closed code that modifies the mark word and then changes it back
during a safepoint. When the mark word is modified, we reuse GC
infrastructure that runs into the assert. If we change the assert to
ignore checking that the mark word is NULL, we don't run into the problem.

>
>> remove a lock associated with JFR.
>

I forgot to modify the header file, see updated webrev.

http://cr.openjdk.java.net/~egahlin/8189425_1/

I also made a change to GrowableArray, the insert_sorted method now
takes a const.
Thanks
Erik

> That bit is fine :)
>
> Thanks,
> David
>
>> Webrev:
>> http://cr.openjdk.java.net/~egahlin/8189425_0
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8189425
>>
>> Thanks
>> Erik
>>
>>

From serguei.spitsyn at oracle.com Wed Oct 18 20:10:56 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Oct 2017 13:10:56 -0700
Subject: RFR (XS): 8175510 Null pointer dereference in getModuleObject of JPLISAgent.c:790
In-Reply-To: <4861b398-df3b-f3e0-9ff3-e9aae09eb98b@oracle.com>
References: <143a7b30-9c0c-6bbb-de84-404b0c150c61@oracle.com>
 <10ff109d-58df-3a14-ff89-ba79c91ec4bc@oracle.com>
 <4861b398-df3b-f3e0-9ff3-e9aae09eb98b@oracle.com>
Message-ID: <27c59119-715f-6ed5-9c3a-f40e763649dd@oracle.com>

On 10/18/17 13:02, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
>
> On 10/18/17 12:34, Chris Plummer wrote:
>> I actually took a look at it the other day but never responded.
>
> Thank you for reviewing it!
>
>> I was wondering if we really want to print a message here. I didn't
>> see any other cases of doing this.
>
> I see no harm in printing it in this particular case, at least to have
> some clue about what is going on.
>
>
>> Also, if we are out of native memory, do we really want to continue?
>
> In this case, the VM itself will most likely be aborted.
> I do not see cases where the JDWP agent is aborted other than at
> initialization.

Sorry for the typo, I meant to say the JPLIS agent. :)

Thanks,
Serguei

>
>
> Thanks,
> Serguei
>
>
>>
>> Chris
>>
>> On 10/18/17 12:15 PM, serguei.spitsyn at oracle.com wrote:
>>> Is anyone interested to review this simple fix?
>>> Otherwise, I'd suggest to push it with one review from David under
>>> trivial fix rule.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 10/12/17 18:01, David Holmes wrote:
>>>> Hi Serguei,
>>>>
>>>> Seems quite reasonable.
>>>>
>>>> Reviewed.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>
>>
>>
>

From chris.plummer at oracle.com Wed Oct 18 20:33:14 2017
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Oct 2017 13:33:14 -0700
Subject: RFR (XS): 8175510 Null pointer dereference in getModuleObject of JPLISAgent.c:790
In-Reply-To: <27c59119-715f-6ed5-9c3a-f40e763649dd@oracle.com>
References: <143a7b30-9c0c-6bbb-de84-404b0c150c61@oracle.com>
 <10ff109d-58df-3a14-ff89-ba79c91ec4bc@oracle.com>
 <4861b398-df3b-f3e0-9ff3-e9aae09eb98b@oracle.com>
 <27c59119-715f-6ed5-9c3a-f40e763649dd@oracle.com>
Message-ID: <55f219f0-582d-4444-72c1-2a6c380fd698@oracle.com>

On 10/18/17 1:10 PM, serguei.spitsyn at oracle.com wrote:
> On 10/18/17 13:02, serguei.spitsyn at oracle.com wrote:
>> Hi Chris,
>>
>>
>> On 10/18/17 12:34, Chris Plummer wrote:
>>> I actually took a look at it the other day but never responded.
>>
>> Thank you for reviewing it!
>>
>>> I was wondering if we really want to print a message here. I didn't
>>> see any other cases of doing this.
>>
>> I see no harm in printing it in this particular case, at least to have
>> some clue about what is going on.
>>
>>
>>> Also, if we are out of native memory, do we really want to continue?
>>
>> In this case, the VM itself will most likely be aborted.
>> I do not see cases where the JDWP agent is aborted other than at
>> initialization.
>
> Sorry for the typo, I meant to say the JPLIS agent. :)

Ok. Changes are fine.

thanks,

Chris

>
> Thanks,
> Serguei
>
>>
>>
>> Thanks,
>> Serguei
>>
>>
>>>
>>> Chris
>>>
>>> On 10/18/17 12:15 PM, serguei.spitsyn at oracle.com wrote:
>>>> Is anyone interested to review this simple fix?
>>>> Otherwise, I'd suggest to push it with one review from David under
>>>> trivial fix rule.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 10/12/17 18:01, David Holmes wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> Seems quite reasonable.
>>>>>
>>>>> Reviewed.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>
>>>
>>>
>>
>

From serguei.spitsyn at oracle.com Wed Oct 18 20:34:30 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Oct 2017 13:34:30 -0700
Subject: RFR (XS): 8175510 Null pointer dereference in getModuleObject of JPLISAgent.c:790
In-Reply-To: <55f219f0-582d-4444-72c1-2a6c380fd698@oracle.com>
References: <143a7b30-9c0c-6bbb-de84-404b0c150c61@oracle.com>
 <10ff109d-58df-3a14-ff89-ba79c91ec4bc@oracle.com>
 <4861b398-df3b-f3e0-9ff3-e9aae09eb98b@oracle.com>
 <27c59119-715f-6ed5-9c3a-f40e763649dd@oracle.com>
 <55f219f0-582d-4444-72c1-2a6c380fd698@oracle.com>
Message-ID: <3d7a8ad7-2c20-b5f0-aa0a-4fbcb6d4905e@oracle.com>

On 10/18/17 13:33, Chris Plummer wrote:
> On 10/18/17 1:10 PM, serguei.spitsyn at oracle.com wrote:
>> On 10/18/17 13:02, serguei.spitsyn at oracle.com wrote:
>>> Hi Chris,
>>>
>>>
>>> On 10/18/17 12:34, Chris Plummer wrote:
>>>> I actually took a look at it the other day but never responded.
>>>
>>> Thank you for reviewing it!
>>>
>>>> I was wondering if we really want to print a message here. I didn't
>>>> see any other cases of doing this.
>>>
>>> I see no harm in printing it in this particular case, at least to have
>>> some clue about what is going on.
>>>
>>>
>>>> Also, if we are out of native memory, do we really want to continue?
>>>
>>> In this case, the VM itself will most likely be aborted.
>>> I do not see cases where the JDWP agent is aborted other than at
>>> initialization.
>>
>> Sorry for the typo, I meant to say the JPLIS agent. :)
> Ok. Changes are fine.

Thanks a lot, Chris!
Serguei

>
> thanks,
>
> Chris
>>
>> Thanks,
>> Serguei
>>
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>>>
>>>> Chris
>>>>
>>>> On 10/18/17 12:15 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Is anyone interested to review this simple fix?
>>>>> Otherwise, I'd suggest to push it with one review from David under
>>>>> trivial fix rule.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 10/12/17 18:01, David Holmes wrote:
>>>>>> Hi Serguei,
>>>>>>
>>>>>> Seems quite reasonable.
>>>>>>
>>>>>> Reviewed.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>> On 13/10/2017 8:21 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Please, review a fix for the Parfait bug:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8175510
>>>>>>>
>>>>>>> Webrev:
>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8175510-jplis-parfait.1/
>>>>>>>
>>>>>>>
>>>>>>> Summary:
>>>>>>>
>>>>>>>    This is the main fragment from the Parfait report:
>>>>>>>
>>>>>>>
>>>>>>>         getModuleObject
>>>>>>>
>>>>>>> #jdk/src/java.instrument/share/native/libinstrument/JPLISAgent.c
>>>>>>> jdk-9+180-JDK9_linux
>>>>>>>
>>>>>>> 783.     int len = (last_slash == NULL) ? 0 : (int)(last_slash - cname);
>>>>>>> 784.     char* pkg_name_buf = (char*)malloc(len + 1);
>>>>>>> 785.
>>>>>>> 786.     jplis_assert_msg(pkg_name_buf != NULL, "OOM error in native tmp buffer allocation");
>>>>>>>
>>>>>>> Pointer checked against constant 'NULL' but does not protect the
>>>>>>> dereference.
>>>>>>>
>>>>>>> 787.     if (last_slash != NULL) {
>>>>>>> 788.         strncpy(pkg_name_buf, cname, len);
>>>>>>> 789.     }
>>>>>>> 790.     pkg_name_buf[len] = '\0';
>>>>>>>
>>>>>>> *Null pointer dereference not protected by null check*
>>>>>>> Write to pointer pkg_name_buf that could be constant 'NULL'
>>>>>>>
>>>>>>>
>>>>>>>    The malloc can return NULL in a case of OOME.
>>>>>>>    The assert at L786 checks the returned pointer for NULL but
>>>>>>> does not protect the dereference at L790.
>>>>>>>    The fix is to replace the assert with printing an error
>>>>>>> message and returning with NULL from the getModuleObject().
>>>>>>>    It must be safe as the returned result is passed to the
>>>>>>> sun.instrument.InstrumentationImpl.transform()
>>>>>>>    which handles null passed as in the module parameter.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>

From serguei.spitsyn at oracle.com Wed Oct 18 21:18:24 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Oct 2017 14:18:24 -0700
Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To: <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com>
References: <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com>
 <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com>
 <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com>
 <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com>
 <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com>
 <41185116-0271-374c-da9e-74e173d6b58c@oracle.com>
 <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com>
 <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com>
 <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com>
Message-ID: <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com>

On 10/18/17 06:51, Yasumasa Suenaga wrote:
> Hi David, Serguei,
>
>> because as soon as we have checked is_usable() and abort happening in
>> another thread may have changed that by calling destroy.
>>
>> This code is basically broken if we hit an abort path instead of a
>> normal VM shutdown.
>
> Can we use MutexLocker for initialize() and destroy() ?
>
>
> I've tried to fix about your comments, but I have an issue about
> volatile.
> PerfMemory.java depends on PerfMemory::_initialized. However VMStructs
> cannot handle static volatile variables.
> I think two approaches as below:
>
>
>   1. Remove _initialized check from PerfMemory.java
>      SA will throw UnmappedAddressException if JSnap try to access
> invalid address including uninitialized memory.
>
>   2. Add static volatile support to VMStructs
>
>
> Which should we do?
> 1. is easy to fix. But 2. might be right way...

Would the below work? :

 578
     static_field(PerfMemory, _initialized, volatile jint) \

It'd be similar to this non-static case:
 362   nonstatic_field(ConstantPoolCacheEntry, _f1, volatile Metadata*) \

Thanks,
Serguei

>
>
> Thanks,
>
> Yasumasa
>
>
> On 2017/10/18 21:34, David Holmes wrote:
>> Just to clarify ...
>>
>> On 18/10/2017 10:28 PM, David Holmes wrote:
>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi David,
>>>>
>>>> Thank you for jumping to this review and helping Yasumasa to sort
>>>> it out!
>>>> I've just discovered that this issue was already on the table for
>>>> several months without a significant progress.
>>>>
>>>>
>>>> On 10/18/17 02:48, David Holmes wrote:
>>>>> Hi Serguei
>>>>>
>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> Sorry for a quite late participation.
>>>>>>
>>>>>> I looked at the previous webrevs and think that this one is much
>>>>>> better.
>>>>>>
>>>>>> Some concern is if we need any kind of synchronization here, e.g.
>>>>>> CAS.
>>>>>> But it depends on the PerfMemory class usage.
>>>>>>
>>>>>> Should we make the static variables '_initialized' and
>>>>>> '_destroyed' volatile?
>>>>>
>>>>> For good measure - yes.
>>>>>
>>>>>> Also, the '_initialized' is set to 1 with:
>>>>>>     159    OrderAccess::release_store(&_initialized, 1);
>>>>>>
>>>>>> Should we do the same to set the '_destroyed'?:
>>>>>>     200    _destroyed = true;
>>>>>
>>>>> There is a benign initialization race but we need the
>>>>> release_store to ensure all the data fields can be read if
>>>>> _initialized is seen as true. But what is missing is a
>>>>> load_acquire() in is_initialized() to ensure we synchronize with
>>>>> that store!
>>>>
>>>> Yes, I noticed that the load_acquire() is missed.
:| >>>> >>>>> >>>>> There is also a potential for a destruction race (if multiple >>>>> aborts happens concurrently in different threads) but that also >>>>> seems benign. In this case there is no data being set so the store >>>>> to _destroyed does not need to be a release_store. >>>> >>>> I'm not convinced yet this is benign as the PerfMemory::destroy() >>>> has this call: >>>> ?? 197 delete_memory_region(); >>> >>> Yes though most of its work ends up being no-ops. >>> >>>> >>>> Now, I started thinking about the asserts that call the is_useable(). >>>> Should they be returns instead? >>> >>> I think this is a somewhat confused chunk of code. It's only >>> fractionally thread-safe yet once in use could be in use >>> concurrently with an aborting thread that calls destroy(). I don't >>> think there is any simple fix for this. If we're in the process of >>> crashing does it really matter if we trigger a secondary crash due >>> to this? >> >> It doesn't matter if we do: >> >> assert(is_usable(),...); >> // continue >> >> or >> >> if (!is_usable()) return; >> // continue >> >> because as soon as we have checked is_usable() and abort happening in >> another thread may have changed that by calling destroy. >> >> This code is basically broken if we hit an abort path instead of a >> normal VM shutdown. >> >> David >> ----- >> >>> The problems with this code go way beyond what Yasumasa is trying to >>> address with the JSnap problem and I would not want to put it back >>> on him to try and come up with an overall solution. >>> >>>> Then the is_destroyed() would better to have the load_acquire(). >>> >>> You could add a load_acquire and do the store_release. It certainly >>> would not hurt, but I don't think it would actually benefit anything >>> either. >>> >>> Cheers, >>> David >>> >>>> Just interested to know what do you think on this. 
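
The release/acquire pairing discussed in the quoted exchange above can be illustrated with standard C++ atomics. This is only a sketch under stated assumptions: `Region` and its members are hypothetical names, `std::atomic` stands in for HotSpot's OrderAccess primitives, and none of this is the PerfMemory implementation or the code in the webrev.

```cpp
#include <atomic>

// Hypothetical stand-in for a lazily initialized, destroyable memory region.
class Region {
public:
    void initialize(int capacity) {
        capacity_ = capacity;  // plain data is written first ...
        // ... and then published: the release store keeps the write above
        // from being reordered after the flag becomes visible.
        initialized_.store(true, std::memory_order_release);
    }

    void destroy() {
        destroyed_.store(true, std::memory_order_release);
    }

    // The acquire load pairs with the release store in initialize(): a thread
    // that observes 'true' here also observes the earlier write to capacity_.
    bool is_initialized() const {
        return initialized_.load(std::memory_order_acquire);
    }

    bool is_destroyed() const {
        return destroyed_.load(std::memory_order_acquire);
    }

    bool is_usable() const {
        return is_initialized() && !is_destroyed();
    }

    int capacity() const { return capacity_; }

private:
    int capacity_ = 0;
    std::atomic<bool> initialized_{false};
    std::atomic<bool> destroyed_{false};
};
```

Note that this only orders the flag against the data it publishes. As pointed out in the thread, a check of is_usable() followed by a use is still racy if another thread can call destroy() in between; the sketch does not address that.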
>>>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> Thank you for your comment. >>>>>>> I uploaded new webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>>>> >>>>>>> Serguei, please comment about this :-) >>>>>>> >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> >>>>>>> 2017-10-18 16:09 GMT+09:00 David Holmes: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>>> I don't think we need the extra fields, just ensure the >>>>>>>>>> existing ones >>>>>>>>>> can't >>>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>> >>>>>>>>> I've added PerfMemory::is_useable() to check whether we can >>>>>>>>> access to >>>>>>>>> PerfMemory. >>>>>>>>> I think this webrev prevent to access to PerfMemory after >>>>>>>>> destroy() call. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>>> >>>>>>>> This: >>>>>>>> >>>>>>>> ?? 90 void PerfMemory::initialize() { >>>>>>>> ?? 91 >>>>>>>> ?? 92?? if (_prologue != NULL) >>>>>>>> ?? 93???? // initialization already performed >>>>>>>> ?? 94???? return; >>>>>>>> >>>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>>> >>>>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>>>> >>>>>>>> -> "called before init or after destroy" >>>>>>>> >>>>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>>>> >>>>>>>> Let's see what Serguei thinks. 
:) >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>>> >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David >>>>>>>>>>> Holmes: >>>>>>>>>>>> >>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hi David, >>>>>>>>>>>>> >>>>>>>>>>>>>> With your changes you no longer null out _prologue so the >>>>>>>>>>>>>> assertion >>>>>>>>>>>>>> would >>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted >>>>>>>>>>>>>> memory region! >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call >>>>>>>>>>>>> munmap() >>>>>>>>>>>>> for PerfMemory. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Perhaps not but there are still other actions that happen >>>>>>>>>>>> and the point >>>>>>>>>>>> is >>>>>>>>>>>> we should not be able to continue to use PerfMemory once it >>>>>>>>>>>> has been >>>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I received same comment from Dmitry in the past, but we >>>>>>>>>>> couldn't >>>>>>>>>>> decide how should we do. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> In that discussion, I uploaded another webrev which adds >>>>>>>>>>> other fields >>>>>>>>>>> for >>>>>>>>>>> JSnap. >>>>>>>>>>> Is it suitable? >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I don't think we need the extra fields, just ensure the >>>>>>>>>> existing ones >>>>>>>>>> can't >>>>>>>>>> be accessed (other than by the tools) after destroy is called. 
>>>>>>>>>> >>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set >>>>>>>>>>>>>> during >>>>>>>>>>>>>> initialization? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field >>>>>>>>>>>>> values. >>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I'm not familiar with these tools. When do we produce a >>>>>>>>>>>> core file after >>>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Ah - right. I assume we need to close off the perfdata file >>>>>>>>>> before we >>>>>>>>>> abort. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> ----------------------- >>>>>>>>>>> #0? perfMemory_exit () >>>>>>>>>>> ?????? at >>>>>>>>>>> >>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>>> >>>>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>>>> ?????? at >>>>>>>>>>> >>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>>> >>>>>>>>>>> #2? 0x00007f99b091c980 in os::abort (dump_core=) >>>>>>>>>>> ?????? at >>>>>>>>>>> >>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>>> >>>>>>>>>>> #3? 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>>> ?????? this=this at entry=0x7ffcacf40b50) >>>>>>>>>>> ?????? at >>>>>>>>>>> >>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>>> >>>>>>>>>>> #4? 0x00007f99b0926f04 in JVM_handle_linux_signal >>>>>>>>>>> (sig=sig at entry=11, >>>>>>>>>>> ?????? 
info=info at entry=0x7ffcacf40df0, >>>>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>>>> ?????? at >>>>>>>>>>> >>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>>> >>>>>>>>>>> ----------------------- >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>>> _prologue that should really be checking is_initialized() >>>>>>>>>>>>>> and/or >>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Assertions and direct guards. Checking _prologue is a >>>>>>>>>>>> placeholder for >>>>>>>>>>>> the >>>>>>>>>>>> real check. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David >>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>> >>>>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>>> >>>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>>> >>>>>>>>>>>>>> #? Internal Error >>>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>>> #? assert(_prologue != __null) failed: called before >>>>>>>>>>>>>> initialization >>>>>>>>>>>>>> # >>>>>>>>>>>>>> >>>>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>>>> initialization, >>>>>>>>>>>>>> or >>>>>>>>>>>>>> after PerfMemory::destroy has been called. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> With your changes you no longer null out _prologue so the >>>>>>>>>>>>>> assertion >>>>>>>>>>>>>> would >>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted >>>>>>>>>>>>>> memory region! >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set >>>>>>>>>>>>>> during >>>>>>>>>>>>>> initialization? But it seems to me that there are various >>>>>>>>>>>>>> checks of >>>>>>>>>>>>>> _prologue that should really be checking is_initialized() >>>>>>>>>>>>>> and/or >>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I added gtest unit test case for this change in new >>>>>>>>>>>>>>>> webrev: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa >>>>>>>>>>>>>>>> Suenaga: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not >>>>>>>>>>>>>>>>>>>> parse core image >>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>> JSnap. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump >>>>>>>>>>>>>>>>>>>> and I've sent >>>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not >>>>>>>>>>>>>>>>>>>> been reviewed >>>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we >>>>>>>>>>>>>>>>>>>> could not >>>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes >>>>>>>>>>>>>>>>>>>> and core >>>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think this >>>>>>>>>>>>>>>>>>>> patch is >>>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards >>>>>>>>>>>>>>>>>>>> double free, and >>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not >>>>>>>>>>>>>>>>>>>> munmap'ed) >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Can you cooperate? 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>> >>>> From david.holmes at oracle.com Thu Oct 19 01:41:55 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 19 Oct 2017 11:41:55 +1000 Subject: RFR(S): 8189425: Minor updates in support of closed changes In-Reply-To: <59E7B3E6.8000101@oracle.com> References: <59E6BB27.8020605@oracle.com> <9acd0472-de6b-4f6f-454e-7a325492af15@oracle.com> <59E7B3E6.8000101@oracle.com> Message-ID: <8a02798d-0d38-6844-a7cb-7f83145fbf29@oracle.com> Hi Erik, On 19/10/2017 6:04 AM, Erik Gahlin wrote: > Hi David, > >> Hi Erik, >> >> On 18/10/2017 12:23 PM, Erik Gahlin wrote: >>> Hi, >>> >>> Could I have a review of this change that will adjust an assertion and >> >> Can you explain the adjustment please. > We have closed code that modifies the mark word and then changes it back > during a safepoint. When the mark word is modified, we reuse GC > infrastructure that run into the assert. If we change the assert to > ignore checking that the mark word is NULL, we don't run into the problem. Ok. This weakens the assert somewhat but then I don't know what it is really trying to catch. >> >>> remove a lock associated with JFR. >> > > I forgot to modify the header file, see updated webrev. > > http://cr.openjdk.java.net/~egahlin/8189425_1/ Okay. > I? also made a change to GrowableArray, the insert_sorted method now > takes a const. Presumably ok. 
Thanks, David > Thanks > Erik > >> That bit is fine :) >> >> Thanks, >> David >> >>> Webrev: >>> http://cr.openjdk.java.net/~egahlin/8189425_0 >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8189425 >>> >>> Thanks >>> Erik >>> >>> > From david.holmes at oracle.com Thu Oct 19 02:00:02 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 19 Oct 2017 12:00:02 +1000 Subject: RFR: 8189440: Event tracing macros for allocation and weak oops processing In-Reply-To: <59E7AC8F.80101@oracle.com> References: <59E6BDAA.3080502@oracle.com> <59E7AC8F.80101@oracle.com> Message-ID: On 19/10/2017 5:33 AM, Erik Gahlin wrote: > Hi David, > >> Hi Erik, >> >> On 18/10/2017 12:34 PM, Erik Gahlin wrote: >>> Hi, >>> >>> Could I have a review of a change that adds two macros to be used >>> with event-based JVM tracing. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8189440 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~egahlin/8189440_0 >> >> Reviewed - though all somewhat mysterious in isolation :( >> > Thanks for the review! > > Sorry about the mysterious part. > >> My only real query is in jniHandles.cpp: >> >>    JvmtiExport::weak_oops_do(is_alive, f); >> +  TRACE_WEAK_OOPS_DO(is_alive, f); >> >> Can't/shouldn't the tracing be done inside weak_oops_do? >> > Stefan Karlsson has a change out for review and if integrated before > this one, I will move the TRACE_WEAK_OOPS_DO into weak_oops_do in > weakProcessor.cpp. Okay. It was a misunderstanding on my part anyway - I thought the macro was for tracing the JvmtiExport::weak_oops_do, but it is actually a weak_oops_do for the tracing code. Thanks, David > Thanks > Erik > From yasuenag at gmail.com Thu Oct 19 03:18:37 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 19 Oct 2017 12:18:37 +0900 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To: <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com> References: <0a760f75-58bd-c334-26c5-0e9adddfe5b7@gmail.com> <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com> <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com> Message-ID: Hi Serguei, > Would the below work? : > > 578 static_field(PerfMemory, _initialized, volatile jint) \ > > It'd be similar to this non-static case: > 362 nonstatic_field(ConstantPoolCacheEntry, _f1, volatile Metadata*) \ I got error messages as below: --------------- In file included from /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected unqualified-id before 'volatile' static_field(PerfMemory, volatile _initialized, jint) \ ^ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, ^~~~~~~~~ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, ^ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected '}' before 'volatile' static_field(PerfMemory, volatile _initialized, jint) \ ^ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, ^~~~~~~~~ 
/home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, ^ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected '}' before 'volatile' static_field(PerfMemory, volatile _initialized, jint) \ ^ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, ^~~~~~~~~ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, ^ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:79: error: expected declaration before '}' token { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, ^ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' static_field(PerfMemory, volatile _initialized, jint) \ ^~~~~~~~~~~~ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, ^ gmake[3]: *** [lib/CompileJvm.gmk:210: /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] Error 1 gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 ERROR: Build failed for target 'images' in configuration 'linux-x86_64-normal-server-fastdebug' (exit code 2) --------------- I changed as below: --------------- diff -r 3e7702cd3f19 src/hotspot/share/runtime/perfMemory.cpp --- a/src/hotspot/share/runtime/perfMemory.cpp Thu Sep 07 15:40:20 2017 +0200 +++ b/src/hotspot/share/runtime/perfMemory.cpp Thu Oct 19 12:15:30 2017 +0900 @@ -51,8 +51,9 @@ char* PerfMemory::_end = NULL; 
char* PerfMemory::_top = NULL; size_t PerfMemory::_capacity = 0; -jint PerfMemory::_initialized = false; +volatile jint PerfMemory::_initialized = 0; PerfDataPrologue* PerfMemory::_prologue = NULL; +volatile bool PerfMemory::_destroyed = false; --- a/src/hotspot/share/runtime/perfMemory.hpp Thu Sep 07 15:40:20 2017 +0200 +++ b/src/hotspot/share/runtime/perfMemory.hpp Thu Oct 19 12:15:30 2017 +0900 @@ -113,13 +113,15 @@ */ class PerfMemory : AllStatic { friend class VMStructs; + friend class PerfMemoryTest; private: static char* _start; static char* _end; static char* _top; static size_t _capacity; static PerfDataPrologue* _prologue; - static jint _initialized; + static volatile jint _initialized; + static volatile bool _destroyed; diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp --- a/src/hotspot/share/runtime/vmStructs.cpp Thu Sep 07 15:40:20 2017 +0200 +++ b/src/hotspot/share/runtime/vmStructs.cpp Thu Oct 19 12:15:30 2017 +0900 @@ -578,7 +578,7 @@ static_field(PerfMemory, _top, char*) \ static_field(PerfMemory, _capacity, size_t) \ static_field(PerfMemory, _prologue, PerfDataPrologue*) \ - static_field(PerfMemory, _initialized, jint) \ + static_field(PerfMemory, volatile _initialized, jint) \ --------------- Thanks, Yasumasa On 2017/10/19 6:18, serguei.spitsyn at oracle.com wrote: > On 10/18/17 06:51, Yasumasa Suenaga wrote: >> Hi David, Serguei, >> >>> because as soon as we have checked is_usable() and abort happening in another thread may have changed that by calling destroy. >>> >>> This code is basically broken if we hit an abort path instead of a normal VM shutdown. >> >> Can we use MutexLocker for initialize() and destroy() ? >> >> >> I've tried to fix about your comments, but I have an issue about volatile. >> PerfMemory.java depends on PerfMemory::_initialized. However VMStructs cannot handle static volatile variables. >> I think two approaches as below: >> >> >> ? 1. Remove _initialized check from PerfMemory.java >> ???? 
SA will throw UnmappedAddressException if JSnap try to access invalid address including uninitialized memory. >> >> ? 2. Add static volatile support to VMStructs >> >> >> Which should we do? >> 1. is easy to fix. But 2. might be right way... > > Would the below work? : > > ?578????? static_field(PerfMemory, _initialized, volatile jint)????????????????????????????????? \ > > It'd be similar to this non-static case: > ?362?? nonstatic_field(ConstantPoolCacheEntry, _f1,????????????????????????????????? volatile Metadata*)??????????????????? \ > > > Thanks, > Serguei > >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2017/10/18 21:34, David Holmes wrote: >>> Just to clarify ... >>> >>> On 18/10/2017 10:28 PM, David Holmes wrote: >>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote: >>>>> Hi David, >>>>> >>>>> Thank you for jumping to this review and helping Yasumasa to sort it out! >>>>> I've just discovered that this issue was already on the table for several months without a significant progress. >>>>> >>>>> >>>>> On 10/18/17 02:48, David Holmes wrote: >>>>>> Hi Serguei >>>>>> >>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> Sorry for a quite late participation. >>>>>>> >>>>>>> I looked at the previous webrevs and think that this one is much better. >>>>>>> >>>>>>> Some concern is if we need any kind of synchronization here, e.g. CAS. >>>>>>> But it depends on the PerfMemory class usage. >>>>>>> >>>>>>> Should we make the static variables '_initialized' and '_destroyed' volatile? >>>>>> >>>>>> For good measure - yes. >>>>>> >>>>>>> Also, the '_initialized' is set to 1 with: >>>>>>> ??? 159??? OrderAccess::release_store(&_initialized, 1); >>>>>>> >>>>>>> Should we do the same to set the '_destroyed'?: >>>>>>> 200 _destroyed = true; >>>>>> >>>>>> There is a benign initialization race but we need the release_store to ensure all the data fields can be read if _initialized is seen as true. 
But what is missing is a load_acquire() in is_initialized() to ensure we synchronize with that store! >>>>> >>>>> Yes, I noticed that the load_acquire() is missed. :| >>>>> >>>>>> >>>>>> There is also a potential for a destruction race (if multiple aborts happens concurrently in different threads) but that also seems benign. In this case there is no data being set so the store to _destroyed does not need to be a release_store. >>>>> >>>>> I'm not convinced yet this is benign as the PerfMemory::destroy() has this call: >>>>>    197 delete_memory_region(); >>>> >>>> Yes though most of its work ends up being no-ops. >>>> >>>>> >>>>> Now, I started thinking about the asserts that call the is_useable(). >>>>> Should they be returns instead? >>>> >>>> I think this is a somewhat confused chunk of code. It's only fractionally thread-safe yet once in use could be in use concurrently with an aborting thread that calls destroy(). I don't think there is any simple fix for this. If we're in the process of crashing does it really matter if we trigger a secondary crash due to this? >>> >>> It doesn't matter if we do: >>> >>> assert(is_usable(),...); >>> // continue >>> >>> or >>> >>> if (!is_usable()) return; >>> // continue >>> >>> because as soon as we have checked is_usable() an abort happening in another thread may have changed that by calling destroy. >>> >>> This code is basically broken if we hit an abort path instead of a normal VM shutdown. >>> >>> David >>> ----- >>> >>>> The problems with this code go way beyond what Yasumasa is trying to address with the JSnap problem and I would not want to put it back on him to try and come up with an overall solution. >>>> >>>>> Then the is_destroyed() would better to have the load_acquire(). >>>> >>>> You could add a load_acquire and do the store_release. It certainly would not hurt, but I don't think it would actually benefit anything either.
>>>> >>>> Cheers, >>>> David >>>> >>>>> Just interested to know what do you think on this. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> >>>>>> Cheers, >>>>>> David >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> Thank you for your comment. >>>>>>>> I uploaded new webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>>>>> >>>>>>>> Serguei, please comment about this :-) >>>>>>>> >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 2017-10-18 16:09 GMT+09:00 David Holmes: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>>>> can't >>>>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>>> >>>>>>>>>> I've added PerfMemory::is_useable() to check whether we can access to >>>>>>>>>> PerfMemory. >>>>>>>>>> I think this webrev prevent to access to PerfMemory after destroy() call. >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>>>> >>>>>>>>> This: >>>>>>>>> >>>>>>>>> ?? 90 void PerfMemory::initialize() { >>>>>>>>> ?? 91 >>>>>>>>> ?? 92?? if (_prologue != NULL) >>>>>>>>> ?? 93???? // initialization already performed >>>>>>>>> ?? 94???? return; >>>>>>>>> >>>>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>>>> >>>>>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>>>>> >>>>>>>>> -> "called before init or after destroy" >>>>>>>>> >>>>>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>>>>> >>>>>>>>> Let's see what Serguei thinks. 
:) >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes: >>>>>>>>>>>>> >>>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>> >>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>>>> would >>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call munmap() >>>>>>>>>>>>>> for PerfMemory. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Perhaps not but there are still other actions that happen and the point >>>>>>>>>>>>> is >>>>>>>>>>>>> we should not be able to continue to use PerfMemory once it has been >>>>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>>>>>>>> decide how should we do. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>>>> >>>>>>>>>>>> In that discussion, I uploaded another webrev which adds other fields >>>>>>>>>>>> for >>>>>>>>>>>> JSnap. >>>>>>>>>>>> Is it suitable? >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>>>> can't >>>>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>>>> >>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>>>> initialization? 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I'm not familiar with these tools. When do we produce a core file after >>>>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Ah - right. I assume we need to close off the perfdata file before we >>>>>>>>>>> abort. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> ----------------------- >>>>>>>>>>>> #0? perfMemory_exit () >>>>>>>>>>>> ?????? at >>>>>>>>>>>> >>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>>>>> ?????? at >>>>>>>>>>>> >>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>>>> #2? 0x00007f99b091c980 in os::abort (dump_core=) >>>>>>>>>>>> ?????? at >>>>>>>>>>>> >>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>>>> #3? 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>>>> ?????? this=this at entry=0x7ffcacf40b50) >>>>>>>>>>>> ?????? at >>>>>>>>>>>> >>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>>>> #4? 0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig at entry=11, >>>>>>>>>>>> ?????? info=info at entry=0x7ffcacf40df0, >>>>>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>>>>> ?????? 
at >>>>>>>>>>>> >>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>>>> ----------------------- >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Assertions and direct guards. Checking _prologue is a placeholder for >>>>>>>>>>>>> the >>>>>>>>>>>>> real check. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> #? Internal Error >>>>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>>>> #? assert(_prologue != __null) failed: called before initialization >>>>>>>>>>>>>>> # >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>>>>> initialization, >>>>>>>>>>>>>>> or >>>>>>>>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>>>> would >>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>>>> initialization? But it seems to me that there are various checks of >>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse core image >>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>> JSnap. 
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not been reviewed >>>>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we could not >>>>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes and core >>>>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think this patch is >>>>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double free, and >>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not munmap'ed) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>> > From yasuenag at gmail.com Thu Oct 19 03:21:26 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 19 Oct 2017 12:21:26 +0900 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. 
In-Reply-To: References: <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com> <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com> Message-ID: <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com> Sorry, I have mistake. But I cannot compile yet: diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp --- a/src/hotspot/share/runtime/vmStructs.cpp Thu Sep 07 15:40:20 2017 +0200 +++ b/src/hotspot/share/runtime/vmStructs.cpp Thu Oct 19 12:21:11 2017 +0900 @@ -578,7 +578,7 @@ static_field(PerfMemory, _top, char*) \ static_field(PerfMemory, _capacity, size_t) \ static_field(PerfMemory, _prologue, PerfDataPrologue*) \ - static_field(PerfMemory, _initialized, jint) \ + static_field(PerfMemory, _initialized, volatile jint) \ -------------- In file included from /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:58: error: invalid conversion from 'volatile void*' to 'void*' [-fpermissive] { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' static_field(PerfMemory, _initialized, volatile jint) \ ^~~~~~~~~~~~ /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, ^ gmake[3]: *** [lib/CompileJvm.gmk:210: /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] Error 1 gmake[2]: *** 
[make/Main.gmk:266: hotspot-server-libs] Error 2 ERROR: Build failed for target 'images' in configuration 'linux-x86_64-normal-server-fastdebug' (exit code 2) -------------- On 2017/10/19 12:18, Yasumasa Suenaga wrote: > Hi Serguei, > >> Would the below work? : >> >> ? 578????? static_field(PerfMemory, _initialized, volatile jint)????????????????????????????????? \ >> >> It'd be similar to this non-static case: >> ? 362?? nonstatic_field(ConstantPoolCacheEntry, _f1,????????????????????????????????? volatile Metadata*)??????????????????? \ > > I got error messages as below: > > --------------- > In file included from /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected unqualified-id before 'volatile' > ????? static_field(PerfMemory,???????? volatile _initialized,????????????????????????????????? jint)????????????????????????????????? \ > ?????????????????????????????????????? ^ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' > ? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, > ???????????????????????????????????????????????????????????????????? ^~~~~~~~~ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' > ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, > ?? ^ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected '}' before 'volatile' > ????? static_field(PerfMemory,???????? volatile _initialized,????????????????????????????????? jint)????????????????????????????????? \ > ?????????????????????????????????????? ^ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' > ? 
{ QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, > ???????????????????????????????????????????????????????????????????? ^~~~~~~~~ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' > ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, > ?? ^ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected '}' before 'volatile' > ????? static_field(PerfMemory,???????? volatile _initialized,????????????????????????????????? jint)????????????????????????????????? \ > ?????????????????????????????????????? ^ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' > ? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, > ???????????????????????????????????????????????????????????????????? ^~~~~~~~~ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' > ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, > ?? ^ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:79: error: expected declaration before '}' token > ? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, > ?????????????????????????????????????????????????????????????????????????????? ^ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' > ????? static_field(PerfMemory,???????? volatile _initialized,????????????????????????????????? jint)????????????????????????????????? \ > ????? ^~~~~~~~~~~~ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' > ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, > ?? 
^ > gmake[3]: *** [lib/CompileJvm.gmk:210: /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] Error 1 > gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 > > ERROR: Build failed for target 'images' in configuration 'linux-x86_64-normal-server-fastdebug' (exit code 2) > --------------- > > > I changed as below: > --------------- > diff -r 3e7702cd3f19 src/hotspot/share/runtime/perfMemory.cpp > --- a/src/hotspot/share/runtime/perfMemory.cpp? Thu Sep 07 15:40:20 2017 +0200 > +++ b/src/hotspot/share/runtime/perfMemory.cpp? Thu Oct 19 12:15:30 2017 +0900 > @@ -51,8 +51,9 @@ > ?char*??????????????????? PerfMemory::_end = NULL; > ?char*??????????????????? PerfMemory::_top = NULL; > ?size_t?????????????????? PerfMemory::_capacity = 0; > -jint???????????????????? PerfMemory::_initialized = false; > +volatile jint??????????? PerfMemory::_initialized = 0; > ?PerfDataPrologue*??????? PerfMemory::_prologue = NULL; > +volatile bool??????????? PerfMemory::_destroyed = false; > > --- a/src/hotspot/share/runtime/perfMemory.hpp? Thu Sep 07 15:40:20 2017 +0200 > +++ b/src/hotspot/share/runtime/perfMemory.hpp? Thu Oct 19 12:15:30 2017 +0900 > @@ -113,13 +113,15 @@ > ? */ > ?class PerfMemory : AllStatic { > ???? friend class VMStructs; > +??? friend class PerfMemoryTest; > ?? private: > ???? static char*? _start; > ???? static char*? _end; > ???? static char*? _top; > ???? static size_t _capacity; > ???? static PerfDataPrologue*? _prologue; > -??? static jint?? _initialized; > +??? static volatile jint????? _initialized; > +??? static volatile bool????? _destroyed; > > diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp > --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 15:40:20 2017 +0200 > +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 12:15:30 2017 +0900 > @@ -578,7 +578,7 @@ > ????? static_field(PerfMemory,????????????????? 
_top,????????????????????????????????????????? char*)???????????????????????????????? \ > ????? static_field(PerfMemory,????????????????? _capacity,???????????????????????????????????? size_t)??????????????????????????????? \ > ????? static_field(PerfMemory,????????????????? _prologue,???????????????????????????????????? PerfDataPrologue*)???????????????????? \ > -???? static_field(PerfMemory,????????????????? _initialized,????????????????????????????????? jint)????????????????????????????????? \ > +???? static_field(PerfMemory,???????? volatile _initialized,????????????????????????????????? jint)????????????????????????????????? \ > --------------- > > > Thanks, > > Yasumasa > > > On 2017/10/19 6:18, serguei.spitsyn at oracle.com wrote: >> On 10/18/17 06:51, Yasumasa Suenaga wrote: >>> Hi David, Serguei, >>> >>>> because as soon as we have checked is_usable() and abort happening in another thread may have changed that by calling destroy. >>>> >>>> This code is basically broken if we hit an abort path instead of a normal VM shutdown. >>> >>> Can we use MutexLocker for initialize() and destroy() ? >>> >>> >>> I've tried to fix about your comments, but I have an issue about volatile. >>> PerfMemory.java depends on PerfMemory::_initialized. However VMStructs cannot handle static volatile variables. >>> I think two approaches as below: >>> >>> >>> ? 1. Remove _initialized check from PerfMemory.java >>> ???? SA will throw UnmappedAddressException if JSnap try to access invalid address including uninitialized memory. >>> >>> ? 2. Add static volatile support to VMStructs >>> >>> >>> Which should we do? >>> 1. is easy to fix. But 2. might be right way... >> >> Would the below work? : >> >> ??578????? static_field(PerfMemory, _initialized, volatile jint)????????????????????????????????? \ >> >> It'd be similar to this non-static case: >> ??362?? nonstatic_field(ConstantPoolCacheEntry, _f1,????????????????????????????????? volatile Metadata*)??????????????????? 
\ >> >> >> Thanks, >> Serguei >> >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2017/10/18 21:34, David Holmes wrote: >>>> Just to clarify ... >>>> >>>> On 18/10/2017 10:28 PM, David Holmes wrote: >>>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Hi David, >>>>>> >>>>>> Thank you for jumping to this review and helping Yasumasa to sort it out! >>>>>> I've just discovered that this issue was already on the table for several months without a significant progress. >>>>>> >>>>>> >>>>>> On 10/18/17 02:48, David Holmes wrote: >>>>>>> Hi Serguei >>>>>>> >>>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> Sorry for a quite late participation. >>>>>>>> >>>>>>>> I looked at the previous webrevs and think that this one is much better. >>>>>>>> >>>>>>>> Some concern is if we need any kind of synchronization here, e.g. CAS. >>>>>>>> But it depends on the PerfMemory class usage. >>>>>>>> >>>>>>>> Should we make the static variables '_initialized' and '_destroyed' volatile? >>>>>>> >>>>>>> For good measure - yes. >>>>>>> >>>>>>>> Also, the '_initialized' is set to 1 with: >>>>>>>> ??? 159??? OrderAccess::release_store(&_initialized, 1); >>>>>>>> >>>>>>>> Should we do the same to set the '_destroyed'?: >>>>>>>> 200 _destroyed = true; >>>>>>> >>>>>>> There is a benign initialization race but we need the release_store to ensure all the data fields can be read if _initialized is seen as true. But what is missing is a load_acquire() in is_initialized() to ensure we synchronize with that store! >>>>>> >>>>>> Yes, I noticed that the load_acquire() is missed. :| >>>>>> >>>>>>> >>>>>>> There is also a potential for a destruction race (if multiple aborts happens concurrently in different threads) but that also seems benign. In this case there is no data being set so the store to _destroyed does not need to be a release_store. 
>>>>>> >>>>>> I'm not convinced yet this is benign as the PerfMemory::destroy() has this call: >>>>>> ?? 197 delete_memory_region(); >>>>> >>>>> Yes though most of its work ends up being no-ops. >>>>> >>>>>> >>>>>> Now, I started thinking about the asserts that call the is_useable(). >>>>>> Should they be returns instead? >>>>> >>>>> I think this is a somewhat confused chunk of code. It's only fractionally thread-safe yet once in use could be in use concurrently with an aborting thread that calls destroy(). I don't think there is any simple fix for this. If we're in the process of crashing does it really matter if we trigger a secondary crash due to this? >>>> >>>> It doesn't matter if we do: >>>> >>>> assert(is_usable(),...); >>>> // continue >>>> >>>> or >>>> >>>> if (!is_usable()) return; >>>> // continue >>>> >>>> because as soon as we have checked is_usable() and abort happening in another thread may have changed that by calling destroy. >>>> >>>> This code is basically broken if we hit an abort path instead of a normal VM shutdown. >>>> >>>> David >>>> ----- >>>> >>>>> The problems with this code go way beyond what Yasumasa is trying to address with the JSnap problem and I would not want to put it back on him to try and come up with an overall solution. >>>>> >>>>>> Then the is_destroyed() would better to have the load_acquire(). >>>>> >>>>> You could add a load_acquire and do the store_release. It certainly would not hurt, but I don't think it would actually benefit anything either. >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>>> Just interested to know what do you think on this. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>>> >>>>>>> Cheers, >>>>>>> David >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> Thank you for your comment. 
>>>>>>>>> I uploaded new webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>>>>>> >>>>>>>>> Serguei, please comment about this :-) >>>>>>>>> >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 2017-10-18 16:09 GMT+09:00 David Holmes: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>>>>> can't >>>>>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>>>> >>>>>>>>>>> I've added PerfMemory::is_useable() to check whether we can access to >>>>>>>>>>> PerfMemory. >>>>>>>>>>> I think this webrev prevent to access to PerfMemory after destroy() call. >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>>>>> >>>>>>>>>> This: >>>>>>>>>> >>>>>>>>>> ?? 90 void PerfMemory::initialize() { >>>>>>>>>> ?? 91 >>>>>>>>>> ?? 92?? if (_prologue != NULL) >>>>>>>>>> ?? 93???? // initialization already performed >>>>>>>>>> ?? 94???? return; >>>>>>>>>> >>>>>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>>>>> >>>>>>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>>>>>> >>>>>>>>>> -> "called before init or after destroy" >>>>>>>>>> >>>>>>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>>>>>> >>>>>>>>>> Let's see what Serguei thinks. 
:) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Hi David, >>>>>>>>>>>>> >>>>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes: >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call munmap() >>>>>>>>>>>>>>> for PerfMemory. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Perhaps not but there are still other actions that happen and the point >>>>>>>>>>>>>> is >>>>>>>>>>>>>> we should not be able to continue to use PerfMemory once it has been >>>>>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>>>>>>>>> decide how should we do. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>>>>> >>>>>>>>>>>>> In that discussion, I uploaded another webrev which adds other fields >>>>>>>>>>>>> for >>>>>>>>>>>>> JSnap. >>>>>>>>>>>>> Is it suitable? >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>>>>> can't >>>>>>>>>>>> be accessed (other than by the tools) after destroy is called. 
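A minimal sketch of that idea, for illustration only: the fields stay intact so a tool reading a core file can still find them, while in-VM access is gated on a state check rather than on _prologue. The class below is hypothetical and models only the lifecycle. is_usable() mirrors the is_useable() helper mentioned in this thread; why_unusable() and read_counter() are invented for the example and are not part of any webrev.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical model of the PerfMemory lifecycle; not the HotSpot code.
class PerfRegionState {
 public:
  void initialize() {
    if (_initialized) return;      // initialization already performed
    _prologue = &_storage;
    _initialized = true;
  }

  // Logical destruction only: _prologue is deliberately left set so that
  // a post-mortem tool could still locate the data.
  void destroy() {
    if (!_initialized || _destroyed) return;
    _destroyed = true;
  }

  bool is_initialized() const { return _initialized; }
  bool is_destroyed() const { return _destroyed; }
  bool is_usable() const { return _initialized && !_destroyed; }

  // Empty string when the region may be used, otherwise the reason.
  std::string why_unusable() const {
    if (!_initialized) return "called before initialization";
    if (_destroyed) return "called after destroy";
    return "";
  }

  // In-VM accessor guarded by the state check instead of _prologue.
  int read_counter() const {
    assert(is_usable() && "called before init or after destroy");
    return _storage;
  }

  const int* prologue() const { return _prologue; }

 private:
  int _storage = 0;
  const int* _prologue = nullptr;
  bool _initialized = false;
  bool _destroyed = false;
};

}  // namespace sketch
```

After destroy(), prologue() is still non-null but is_usable() is false, which is why a guard on _prologue alone would let callers through.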
>>>>>>>>>>>> >>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>>>>> initialization? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm not familiar with these tools. When do we produce a core file after >>>>>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Ah - right. I assume we need to close off the perfdata file before we >>>>>>>>>>>> abort. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>> #0? perfMemory_exit () >>>>>>>>>>>>> ?????? at >>>>>>>>>>>>> >>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>>>>>> ?????? at >>>>>>>>>>>>> >>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>>>>> #2? 0x00007f99b091c980 in os::abort (dump_core=) >>>>>>>>>>>>> ?????? at >>>>>>>>>>>>> >>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>>>>> #3? 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>>>>> ?????? this=this at entry=0x7ffcacf40b50) >>>>>>>>>>>>> ?????? at >>>>>>>>>>>>> >>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>>>>> #4? 0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig at entry=11, >>>>>>>>>>>>> ?????? 
info=info at entry=0x7ffcacf40df0, >>>>>>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>>>>>> ?????? at >>>>>>>>>>>>> >>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Assertions and direct guards. Checking _prologue is a placeholder for >>>>>>>>>>>>>> the >>>>>>>>>>>>>> real check. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> #? Internal Error >>>>>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>>>>> #? assert(_prologue != __null) failed: called before initialization >>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>>>>>> initialization, >>>>>>>>>>>>>>>> or >>>>>>>>>>>>>>>> after PerfMemory::destroy has been called. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>>>>> initialization? But it seems to me that there are various checks of >>>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse core image >>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>> JSnap. 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not been reviewed >>>>>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we could not >>>>>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes and core >>>>>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think this patch is >>>>>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double free, and >>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not munmap'ed) >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>>>>> >>>>>>>> >>>>>> >> From serguei.spitsyn at oracle.com Thu Oct 19 09:37:30 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 19 Oct 2017 02:37:30 -0700 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. 
In-Reply-To: <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com>
References: <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com> <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com> <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com>
Message-ID: <8e2c7e3f-2f3a-7168-712b-2642a4e88239@oracle.com>

Hi Yasumasa,

I see the problem. As it turns out, making these variables volatile is non-trivial. But thank you very much for trying!

I'd suggest falling back to your previous approach, as synchronization was not there in the first place and is not part of the original issue you are trying to fix (unless David or anyone else has a simple solution). But let's check that David does not object to it.

I will sponsor your fix after you send me a patch.

Thanks,
Serguei


On 10/18/17 20:21, Yasumasa Suenaga wrote:
> Sorry, I made a mistake.
> But I still cannot compile:
>
> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp
> --- a/src/hotspot/share/runtime/vmStructs.cpp   Thu Sep 07 15:40:20 2017 +0200
> +++ b/src/hotspot/share/runtime/vmStructs.cpp   Thu Oct 19 12:21:11 2017 +0900
> @@ -578,7 +578,7 @@
>       static_field(PerfMemory, _top, char*) \
>       static_field(PerfMemory, _capacity, size_t) \
>       static_field(PerfMemory, _prologue, PerfDataPrologue*) \
> -     static_field(PerfMemory, _initialized, jint) \
> +     static_field(PerfMemory, _initialized, volatile jint)
\ > > -------------- > In file included from > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:58: > error: invalid conversion from 'volatile void*' to 'void*' [-fpermissive] > ? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, > &typeName::fieldName }, > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: > note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' > ????? static_field(PerfMemory, > _initialized,????????????????????????????????? volatile > jint)??????????????????????????? \ > ????? ^~~~~~~~~~~~ > /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: > note: in expansion of macro 'VM_STRUCTS' > ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, > ?? ^ > gmake[3]: *** [lib/CompileJvm.gmk:210: > /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] > Error 1 > gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 > > ERROR: Build failed for target 'images' in configuration > 'linux-x86_64-normal-server-fastdebug' (exit code 2) > -------------- > > > > On 2017/10/19 12:18, Yasumasa Suenaga wrote: >> Hi Serguei, >> >>> Would the below work? : >>> >>> ? 578????? static_field(PerfMemory, _initialized, volatile >>> jint)????????????????????????????????? \ >>> >>> It'd be similar to this non-static case: >>> ? 362?? nonstatic_field(ConstantPoolCacheEntry, >>> _f1,????????????????????????????????? volatile >>> Metadata*)??????????????????? \ >> >> I got error messages as below: >> >> --------------- >> In file included from >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >> error: expected unqualified-id before 'volatile' >> ?????? static_field(PerfMemory,???????? 
volatile _initialized, >> jint)????????????????????????????????? \ >> ??????????????????????????????????????? ^ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >> &typeName::fieldName }, >> ^~~~~~~~~ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >> note: in expansion of macro 'VM_STRUCTS' >> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >> ??? ^ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >> error: expected '}' before 'volatile' >> ?????? static_field(PerfMemory,???????? volatile _initialized, >> jint)????????????????????????????????? \ >> ??????????????????????????????????????? ^ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >> &typeName::fieldName }, >> ^~~~~~~~~ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >> note: in expansion of macro 'VM_STRUCTS' >> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >> ??? ^ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >> error: expected '}' before 'volatile' >> ?????? static_field(PerfMemory,???????? volatile _initialized, >> jint)????????????????????????????????? \ >> ??????????????????????????????????????? ^ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >> &typeName::fieldName }, >> ^~~~~~~~~ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >> note: in expansion of macro 'VM_STRUCTS' >> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >> ??? 
^ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:79: >> error: expected declaration before '}' token >> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >> &typeName::fieldName }, >> ^ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: >> note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >> ?????? static_field(PerfMemory,???????? volatile _initialized, >> jint)????????????????????????????????? \ >> ?????? ^~~~~~~~~~~~ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >> note: in expansion of macro 'VM_STRUCTS' >> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >> ??? ^ >> gmake[3]: *** [lib/CompileJvm.gmk:210: >> /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] >> Error 1 >> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >> >> ERROR: Build failed for target 'images' in configuration >> 'linux-x86_64-normal-server-fastdebug' (exit code 2) >> --------------- >> >> >> I changed as below: >> --------------- >> diff -r 3e7702cd3f19 src/hotspot/share/runtime/perfMemory.cpp >> --- a/src/hotspot/share/runtime/perfMemory.cpp? Thu Sep 07 15:40:20 >> 2017 +0200 >> +++ b/src/hotspot/share/runtime/perfMemory.cpp? Thu Oct 19 12:15:30 >> 2017 +0900 >> @@ -51,8 +51,9 @@ >> ??char*??????????????????? PerfMemory::_end = NULL; >> ??char*??????????????????? PerfMemory::_top = NULL; >> ??size_t?????????????????? PerfMemory::_capacity = 0; >> -jint???????????????????? PerfMemory::_initialized = false; >> +volatile jint??????????? PerfMemory::_initialized = 0; >> ??PerfDataPrologue*??????? PerfMemory::_prologue = NULL; >> +volatile bool??????????? PerfMemory::_destroyed = false; >> >> --- a/src/hotspot/share/runtime/perfMemory.hpp? Thu Sep 07 15:40:20 >> 2017 +0200 >> +++ b/src/hotspot/share/runtime/perfMemory.hpp? Thu Oct 19 12:15:30 >> 2017 +0900 >> @@ -113,13 +113,15 @@ >> ?? 
*/ >> ??class PerfMemory : AllStatic { >> ????? friend class VMStructs; >> +??? friend class PerfMemoryTest; >> ??? private: >> ????? static char*? _start; >> ????? static char*? _end; >> ????? static char*? _top; >> ????? static size_t _capacity; >> ????? static PerfDataPrologue*? _prologue; >> -??? static jint?? _initialized; >> +??? static volatile jint????? _initialized; >> +??? static volatile bool????? _destroyed; >> >> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >> --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 15:40:20 >> 2017 +0200 >> +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 12:15:30 >> 2017 +0900 >> @@ -578,7 +578,7 @@ >> ?????? static_field(PerfMemory, _top, >> char*)???????????????????????????????? \ >> ?????? static_field(PerfMemory, _capacity, >> size_t)??????????????????????????????? \ >> ?????? static_field(PerfMemory, _prologue, >> PerfDataPrologue*)???????????????????? \ >> -???? static_field(PerfMemory, _initialized, >> jint)????????????????????????????????? \ >> +???? static_field(PerfMemory,???????? volatile _initialized, >> jint)????????????????????????????????? \ >> --------------- >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2017/10/19 6:18, serguei.spitsyn at oracle.com wrote: >>> On 10/18/17 06:51, Yasumasa Suenaga wrote: >>>> Hi David, Serguei, >>>> >>>>> because as soon as we have checked is_usable() and abort happening >>>>> in another thread may have changed that by calling destroy. >>>>> >>>>> This code is basically broken if we hit an abort path instead of a >>>>> normal VM shutdown. >>>> >>>> Can we use MutexLocker for initialize() and destroy() ? >>>> >>>> >>>> I've tried to fix about your comments, but I have an issue about >>>> volatile. >>>> PerfMemory.java depends on PerfMemory::_initialized. However >>>> VMStructs cannot handle static volatile variables. >>>> I think two approaches as below: >>>> >>>> >>>> ? 1. Remove _initialized check from PerfMemory.java >>>> ???? 
SA will throw UnmappedAddressException if JSnap tries to access an invalid address, including uninitialized memory.
>>>>
>>>>   2. Add static volatile support to VMStructs
>>>>
>>>>
>>>> Which should we do?
>>>> 1. is easy to fix. But 2. might be the right way...
>>>
>>> Would the below work? :
>>>
>>>   578      static_field(PerfMemory, _initialized, volatile jint) \
>>>
>>> It'd be similar to this non-static case:
>>>   362   nonstatic_field(ConstantPoolCacheEntry, _f1, volatile Metadata*) \
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2017/10/18 21:34, David Holmes wrote:
>>>>> Just to clarify ...
>>>>>
>>>>> On 18/10/2017 10:28 PM, David Holmes wrote:
>>>>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Thank you for jumping into this review and helping Yasumasa to sort it out!
>>>>>>> I've just discovered that this issue has been on the table for several months without significant progress.
>>>>>>>
>>>>>>>
>>>>>>> On 10/18/17 02:48, David Holmes wrote:
>>>>>>>> Hi Serguei
>>>>>>>>
>>>>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> Sorry for joining so late.
>>>>>>>>>
>>>>>>>>> I looked at the previous webrevs and think that this one is much better.
>>>>>>>>>
>>>>>>>>> One concern is whether we need any kind of synchronization here, e.g. CAS.
>>>>>>>>> But it depends on the PerfMemory class usage.
>>>>>>>>>
>>>>>>>>> Should we make the static variables '_initialized' and '_destroyed' volatile?
>>>>>>>>
>>>>>>>> For good measure - yes.
>>>>>>>>
>>>>>>>>> Also, the '_initialized' is set to 1 with:
>>>>>>>>>
159 OrderAccess::release_store(&_initialized, 1);
>>>>>>>>>
>>>>>>>>> Should we do the same to set the '_destroyed'?:
>>>>>>>>> 200 _destroyed = true;
>>>>>>>>
>>>>>>>> There is a benign initialization race but we need the release_store to ensure all the data fields can be read if _initialized is seen as true. But what is missing is a load_acquire() in is_initialized() to ensure we synchronize with that store!
>>>>>>>
>>>>>>> Yes, I noticed that the load_acquire() is missing. :|
>>>>>>>
>>>>>>>>
>>>>>>>> There is also a potential for a destruction race (if multiple aborts happen concurrently in different threads) but that also seems benign. In this case there is no data being set so the store to _destroyed does not need to be a release_store.
>>>>>>>
>>>>>>> I'm not convinced yet that this is benign, as PerfMemory::destroy() has this call:
>>>>>>>    197 delete_memory_region();
>>>>>>
>>>>>> Yes, though most of its work ends up being no-ops.
>>>>>>
>>>>>>>
>>>>>>> Now, I started thinking about the asserts that call is_useable().
>>>>>>> Should they be returns instead?
>>>>>>
>>>>>> I think this is a somewhat confused chunk of code. It's only fractionally thread-safe, yet once in use it could be in use concurrently with an aborting thread that calls destroy(). I don't think there is any simple fix for this. If we're in the process of crashing, does it really matter if we trigger a secondary crash due to this?
>>>>>
>>>>> It doesn't matter if we do:
>>>>>
>>>>> assert(is_usable(),...);
>>>>> // continue
>>>>>
>>>>> or
>>>>>
>>>>> if (!is_usable()) return;
>>>>> // continue
>>>>>
>>>>> because as soon as we have checked is_usable(), an abort happening in another thread may have changed that by calling destroy.
>>>>>
>>>>> This code is basically broken if we hit an abort path instead of a normal VM shutdown.
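For readers following the ordering argument quoted above, here is a minimal sketch of the release-store / load-acquire pairing. It is an illustration under stated assumptions, not the patch under review: it uses std::atomic from standard C++ instead of HotSpot's OrderAccess, and g_buffer, g_start, g_capacity and g_initialized are hypothetical stand-ins for the PerfMemory fields.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical stand-ins for PerfMemory's static fields; not HotSpot code.
static char             g_buffer[64];
static char*            g_start    = nullptr;
static std::size_t      g_capacity = 0;
static std::atomic<int> g_initialized{0};

// Writer side: set the data fields first, then publish the flag with a
// release store. This plays the role of
// OrderAccess::release_store(&_initialized, 1) in the quoted code.
inline void initialize() {
  g_start    = g_buffer;
  g_capacity = sizeof(g_buffer);
  g_initialized.store(1, std::memory_order_release);
}

// Reader side: the acquire load pairs with the release store above, so a
// thread that observes 1 here also observes g_start and g_capacity as
// written by initialize(). This is the load the thread says is missing.
inline bool is_initialized() {
  return g_initialized.load(std::memory_order_acquire) != 0;
}

// Reads a data field only after the flag has been observed as set.
inline std::size_t read_capacity() {
  assert(is_initialized() && "called before initialization");
  assert(g_start != nullptr);
  return g_capacity;
}

}  // namespace sketch
```

The pairing orders initialization against readers only. It does nothing about the check-then-use race with destroy() described above, where another thread can invalidate the state right after the check.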
>>>>> >>>>> David >>>>> ----- >>>>> >>>>>> The problems with this code go way beyond what Yasumasa is trying >>>>>> to address with the JSnap problem and I would not want to put it >>>>>> back on him to try and come up with an overall solution. >>>>>> >>>>>>> Then the is_destroyed() would better to have the load_acquire(). >>>>>> >>>>>> You could add a load_acquire and do the store_release. It >>>>>> certainly would not hurt, but I don't think it would actually >>>>>> benefit anything either. >>>>>> >>>>>> Cheers, >>>>>> David >>>>>> >>>>>>> Just interested to know what do you think on this. >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>>> >>>>>>>> Cheers, >>>>>>>> David >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> Thank you for your comment. >>>>>>>>>> I uploaded new webrev: >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>>>>>>> >>>>>>>>>> Serguei, please comment about this :-) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 2017-10-18 16:09 GMT+09:00 David >>>>>>>>>> Holmes: >>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>> >>>>>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>>> I don't think we need the extra fields, just ensure the >>>>>>>>>>>>> existing ones >>>>>>>>>>>>> can't >>>>>>>>>>>>> be accessed (other than by the tools) after destroy is >>>>>>>>>>>>> called. >>>>>>>>>>>> >>>>>>>>>>>> I've added PerfMemory::is_useable() to check whether we can >>>>>>>>>>>> access to >>>>>>>>>>>> PerfMemory. >>>>>>>>>>>> I think this webrev prevent to access to PerfMemory after >>>>>>>>>>>> destroy() call. >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>>>>>> >>>>>>>>>>> This: >>>>>>>>>>> >>>>>>>>>>> ?? 90 void PerfMemory::initialize() { >>>>>>>>>>> ?? 91 >>>>>>>>>>> ?? 92?? 
if (_prologue != NULL) >>>>>>>>>>> ?? 93???? // initialization already performed >>>>>>>>>>> ?? 94???? return; >>>>>>>>>>> >>>>>>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>>>>>> >>>>>>>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>>>>>>> >>>>>>>>>>> -> "called before init or after destroy" >>>>>>>>>>> >>>>>>>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>>>>>>> >>>>>>>>>>> Let's see what Serguei thinks. :) >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 2017-10-18 13:44 GMT+09:00 David >>>>>>>>>>>> Holmes: >>>>>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David >>>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so >>>>>>>>>>>>>>>>> the assertion >>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted >>>>>>>>>>>>>>>>> memory region! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not >>>>>>>>>>>>>>>> call munmap() >>>>>>>>>>>>>>>> for PerfMemory. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Perhaps not but there are still other actions that >>>>>>>>>>>>>>> happen and the point >>>>>>>>>>>>>>> is >>>>>>>>>>>>>>> we should not be able to continue to use PerfMemory once >>>>>>>>>>>>>>> it has been >>>>>>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I received same comment from Dmitry in the past, but we >>>>>>>>>>>>>> couldn't >>>>>>>>>>>>>> decide how should we do. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> In that discussion, I uploaded another webrev which adds >>>>>>>>>>>>>> other fields >>>>>>>>>>>>>> for >>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>> Is it suitable? >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I don't think we need the extra fields, just ensure the >>>>>>>>>>>>> existing ones >>>>>>>>>>>>> can't >>>>>>>>>>>>> be accessed (other than by the tools) after destroy is >>>>>>>>>>>>> called. >>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set >>>>>>>>>>>>>>>>> during >>>>>>>>>>>>>>>>> initialization? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field >>>>>>>>>>>>>>>> values. >>>>>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I'm not familiar with these tools. When do we produce a >>>>>>>>>>>>>>> core file after >>>>>>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Ah - right. I assume we need to close off the perfdata >>>>>>>>>>>>> file before we >>>>>>>>>>>>> abort. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>> #0 perfMemory_exit () >>>>>>>>>>>>>> at >>>>>>>>>>>>>> >>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>>>>>> >>>>>>>>>>>>>> #1 0x00007f99b091c949 in os::shutdown () >>>>>>>>>>>>>>
at >>>>>>>>>>>>>> >>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>>>>>> >>>>>>>>>>>>>> #2 0x00007f99b091c980 in os::abort (dump_core=<optimized >>>>>>>>>>>>>> out>) >>>>>>>>>>>>>> at >>>>>>>>>>>>>> >>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>>>>>> >>>>>>>>>>>>>> #3 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>>>>>> this=this@entry=0x7ffcacf40b50) >>>>>>>>>>>>>> at >>>>>>>>>>>>>> >>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>>>>>> >>>>>>>>>>>>>> #4 0x00007f99b0926f04 in JVM_handle_linux_signal >>>>>>>>>>>>>> (sig=sig@entry=11, >>>>>>>>>>>>>> info=info@entry=0x7ffcacf40df0, >>>>>>>>>>>>>> ucVoid=ucVoid@entry=0x7ffcacf40cc0, >>>>>>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized@entry=1) >>>>>>>>>>>>>> at >>>>>>>>>>>>>> >>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>>>>>> >>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>>>>>> _prologue that should really be checking >>>>>>>>>>>>>>>>> is_initialized() and/or >>>>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Assertions and direct guards. Checking _prologue is a >>>>>>>>>>>>>>> placeholder for >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> real check.
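The point about guarding on explicit state rather than on _prologue can be sketched as follows. This is a simplified, single-threaded illustration under stated assumptions: PerfArea, Prologue, record_update() and prologue_for_tools() are hypothetical names, the flags are plain bools with no memory ordering, and an early return is used where the webrev uses an assert so that the behaviour is easy to observe.

```cpp
#include <cassert>

// Hypothetical stand-in for PerfDataPrologue.
struct Prologue { int num_entries; };

// Hypothetical stand-in for PerfMemory; not the real HotSpot class.
class PerfArea {
 public:
  static void initialize(Prologue* p) {
    assert(p != nullptr);
    if (is_initialized()) return;  // guard on state, not on _prologue
    _prologue = p;
    _initialized = true;
  }

  // Logical destruction: the fields are left in place so that a
  // post-mortem tool reading a core file can still find them.
  static void destroy() {
    if (!is_usable()) return;
    _destroyed = true;
  }

  static bool is_initialized() { return _initialized; }
  static bool is_destroyed()   { return _destroyed; }
  // True only between initialize() and destroy().
  static bool is_usable()      { return _initialized && !_destroyed; }

  // VM-side access is refused before init or after destroy.
  static bool record_update() {
    if (!is_usable()) return false;
    _prologue->num_entries++;
    return true;
  }

  // Tool-side access: still readable after destroy().
  static const Prologue* prologue_for_tools() { return _prologue; }

 private:
  static Prologue* _prologue;
  static bool _initialized;
  static bool _destroyed;
};

Prologue* PerfArea::_prologue = nullptr;
bool PerfArea::_initialized = false;
bool PerfArea::_destroyed = false;
```

Keeping _prologue set after destroy() is what lets the fields survive into the core file, while is_usable() distinguishes "before init" from "after destroy" for code running inside the VM.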
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David >>>>>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> By chance we ran into this bug which I analysed >>>>>>>>>>>>>>>>> yesterday: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> #? Internal Error >>>>>>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>>>>>> #? assert(_prologue != __null) failed: called before >>>>>>>>>>>>>>>>> initialization >>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>>>>>>> initialization, >>>>>>>>>>>>>>>>> or >>>>>>>>>>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so >>>>>>>>>>>>>>>>> the assertion >>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted >>>>>>>>>>>>>>>>> memory region! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set >>>>>>>>>>>>>>>>> during >>>>>>>>>>>>>>>>> initialization? But it seems to me that there are >>>>>>>>>>>>>>>>> various checks of >>>>>>>>>>>>>>>>> _prologue that should really be checking >>>>>>>>>>>>>>>>> is_initialized() and/or >>>>>>>>>>>>>>>>> is_destroyed() as a guard. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I added gtest unit test case for this change in new >>>>>>>>>>>>>>>>>>> webrev: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa >>>>>>>>>>>>>>>>>>> Suenaga: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not >>>>>>>>>>>>>>>>>>>>>>> parse core image >>>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse >>>>>>>>>>>>>>>>>>>>>>> coredump and I've sent >>>>>>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has >>>>>>>>>>>>>>>>>>>>>>> not been reviewed >>>>>>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but >>>>>>>>>>>>>>>>>>>>>>> we could not >>>>>>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java >>>>>>>>>>>>>>>>>>>>>>> processes and core >>>>>>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this >>>>>>>>>>>>>>>>>>>>>>> issue. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think >>>>>>>>>>>>>>>>>>>>>>> this patch is >>>>>>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards >>>>>>>>>>>>>>>>>>>>>>> double free, and >>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are >>>>>>>>>>>>>>>>>>>>>>> not munmap'ed) >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Can you cooperate? 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>> >>>>>>> >>> From markus.gronlund at oracle.com Thu Oct 19 09:42:49 2017 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Thu, 19 Oct 2017 02:42:49 -0700 (PDT) Subject: RFR(S): 8189425: Minor updates in support of closed changes In-Reply-To: <59E7B3E6.8000101@oracle.com> References: <59E6BB27.8020605@oracle.com> <9acd0472-de6b-4f6f-454e-7a325492af15@oracle.com> <59E7B3E6.8000101@oracle.com> Message-ID: Hi Erik, Looks good. Thanks Markus -----Original Message----- From: Erik Gahlin Sent: den 18 oktober 2017 22:05 To: David Holmes; serviceability-dev at openjdk.java.net Subject: Re: RFR(S): 8189425: Minor updates in support of closed changes Hi David, > Hi Erik, > > On 18/10/2017 12:23 PM, Erik Gahlin wrote: >> Hi, >> >> Could I have a review of this change that will adjust an assertion >> and > > Can you explain the adjustment please. We have closed code that modifies the mark word and then changes it back during a safepoint. When the mark word is modified, we reuse GC infrastructure that run into the assert. If we change the assert to ignore checking that the mark word is NULL, we don't run into the problem. > >> remove a lock associated with JFR. > I forgot to modify the header file, see updated webrev. http://cr.openjdk.java.net/~egahlin/8189425_1/ I also made a change to GrowableArray, the insert_sorted method now takes a const. 
Thanks Erik > That bit is fine :) > > Thanks, > David > >> Webrev: >> http://cr.openjdk.java.net/~egahlin/8189425_0 >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8189425 >> >> Thanks >> Erik >> >> From rkennke at redhat.com Thu Oct 19 10:28:30 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 19 Oct 2017 12:28:30 +0200 Subject: RFR: 8189373: jmap -heap exited with error code In-Reply-To: <5f387be5-bd8a-91c6-39d4-871a74e80355@redhat.com> References: <5f387be5-bd8a-91c6-39d4-871a74e80355@redhat.com> Message-ID: On 18.10.2017 at 20:29, Roman Kennke wrote: > My recent CMSHeap extraction has broken the JVM serviceability agent. > It looks like I actually need a little bit more boilerplate to make it > happy: > > http://cr.openjdk.java.net/~rkennke/8189373/webrev.00/ > > > It does fix the test that's mentioned in the bug report: > > https://bugs.openjdk.java.net/browse/JDK-8189373 > > Is this the correct way to fix it? > > Roman > > Ping? Can I get a review for this one from someone who knows the SA? This is an integration blocker. Thanks, Roman From david.holmes at oracle.com Thu Oct 19 10:41:05 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 19 Oct 2017 20:41:05 +1000 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To: <8e2c7e3f-2f3a-7168-712b-2642a4e88239@oracle.com> References: <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com> <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com> <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com> <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com> <8e2c7e3f-2f3a-7168-712b-2642a4e88239@oracle.com> Message-ID: <05720ca7-73ef-2761-e8f5-1d0fc45c8bdc@oracle.com> Hi Serguei, Yasumasa, I suggest we leave the volatile off for now and file an RFE to add volatile_static_field support to VMStructs and update later. I don't think trying to introduce locking would be a good idea as it would likely lead to deadlocks when a crash occurs. This could also be investigated as a future RFE if desired. Thanks, David On 19/10/2017 7:37 PM, serguei.spitsyn at oracle.com wrote: > Hi Yasumasa, > > I see the problem. > As it turned out, making these variables volatile is non-trivial. > But thank you a lot for trying! > > I'd suggest to fall back to your previous approach as synchronization > was not there > in the first place, and it is not a part of the original issue you are > trying to fix > (if David or anyone else does not have a simple solution). > But let's check if David does not object against it. > > I will sponsor your fix after you send me a patch. > > Thanks, > Serguei > > > On 10/18/17 20:21, Yasumasa Suenaga wrote: >> Sorry, I made a mistake. >> But I cannot compile yet: >> >> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >> --- a/src/hotspot/share/runtime/vmStructs.cpp  Thu Sep 07 15:40:20 >> 2017 +0200 >> +++ b/src/hotspot/share/runtime/vmStructs.cpp  Thu Oct 19 12:21:11 >> 2017 +0900 >> @@ -578,7 +578,7 @@ >>
static_field(PerfMemory, _top, >> char*)???????????????????????????????? \ >> ????? static_field(PerfMemory, _capacity, >> size_t)??????????????????????????????? \ >> ????? static_field(PerfMemory, _prologue, >> PerfDataPrologue*)???????????????????? \ >> -???? static_field(PerfMemory, _initialized, >> jint)????????????????????????????????? \ >> +???? static_field(PerfMemory, >> _initialized,????????????????????????????????? volatile >> jint)??????????????????????????? \ >> >> -------------- >> In file included from >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >> >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:58: >> error: invalid conversion from 'volatile void*' to 'void*' [-fpermissive] >> ? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >> &typeName::fieldName }, >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: >> note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >> ????? static_field(PerfMemory, >> _initialized,????????????????????????????????? volatile >> jint)??????????????????????????? \ >> ????? ^~~~~~~~~~~~ >> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >> note: in expansion of macro 'VM_STRUCTS' >> ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >> ?? ^ >> gmake[3]: *** [lib/CompileJvm.gmk:210: >> /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] >> Error 1 >> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >> >> ERROR: Build failed for target 'images' in configuration >> 'linux-x86_64-normal-server-fastdebug' (exit code 2) >> -------------- >> >> >> >> On 2017/10/19 12:18, Yasumasa Suenaga wrote: >>> Hi Serguei, >>> >>>> Would the below work? : >>>> >>>> ? 578????? static_field(PerfMemory, _initialized, volatile >>>> jint)????????????????????????????????? 
\ >>>> >>>> It'd be similar to this non-static case: >>>> ? 362?? nonstatic_field(ConstantPoolCacheEntry, >>>> _f1,????????????????????????????????? volatile >>>> Metadata*)??????????????????? \ >>> >>> I got error messages as below: >>> >>> --------------- >>> In file included from >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>> >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>> error: expected unqualified-id before 'volatile' >>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>> jint)????????????????????????????????? \ >>> ??????????????????????????????????????? ^ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>> &typeName::fieldName }, >>> ^~~~~~~~~ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>> note: in expansion of macro 'VM_STRUCTS' >>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>> ??? ^ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>> error: expected '}' before 'volatile' >>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>> jint)????????????????????????????????? \ >>> ??????????????????????????????????????? ^ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>> &typeName::fieldName }, >>> ^~~~~~~~~ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>> note: in expansion of macro 'VM_STRUCTS' >>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>> ??? ^ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>> error: expected '}' before 'volatile' >>> ?????? 
static_field(PerfMemory,???????? volatile _initialized, >>> jint)????????????????????????????????? \ >>> ??????????????????????????????????????? ^ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>> &typeName::fieldName }, >>> ^~~~~~~~~ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>> note: in expansion of macro 'VM_STRUCTS' >>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>> ??? ^ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:79: >>> error: expected declaration before '}' token >>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>> &typeName::fieldName }, >>> ^ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: >>> note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>> jint)????????????????????????????????? \ >>> ?????? ^~~~~~~~~~~~ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>> note: in expansion of macro 'VM_STRUCTS' >>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>> ??? ^ >>> gmake[3]: *** [lib/CompileJvm.gmk:210: >>> /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] >>> Error 1 >>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>> >>> ERROR: Build failed for target 'images' in configuration >>> 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>> --------------- >>> >>> >>> I changed as below: >>> --------------- >>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/perfMemory.cpp >>> --- a/src/hotspot/share/runtime/perfMemory.cpp? Thu Sep 07 15:40:20 >>> 2017 +0200 >>> +++ b/src/hotspot/share/runtime/perfMemory.cpp? 
Thu Oct 19 12:15:30 >>> 2017 +0900 >>> @@ -51,8 +51,9 @@ >>> ??char*??????????????????? PerfMemory::_end = NULL; >>> ??char*??????????????????? PerfMemory::_top = NULL; >>> ??size_t?????????????????? PerfMemory::_capacity = 0; >>> -jint???????????????????? PerfMemory::_initialized = false; >>> +volatile jint??????????? PerfMemory::_initialized = 0; >>> ??PerfDataPrologue*??????? PerfMemory::_prologue = NULL; >>> +volatile bool??????????? PerfMemory::_destroyed = false; >>> >>> --- a/src/hotspot/share/runtime/perfMemory.hpp? Thu Sep 07 15:40:20 >>> 2017 +0200 >>> +++ b/src/hotspot/share/runtime/perfMemory.hpp? Thu Oct 19 12:15:30 >>> 2017 +0900 >>> @@ -113,13 +113,15 @@ >>> ?? */ >>> ??class PerfMemory : AllStatic { >>> ????? friend class VMStructs; >>> +??? friend class PerfMemoryTest; >>> ??? private: >>> ????? static char*? _start; >>> ????? static char*? _end; >>> ????? static char*? _top; >>> ????? static size_t _capacity; >>> ????? static PerfDataPrologue*? _prologue; >>> -??? static jint?? _initialized; >>> +??? static volatile jint????? _initialized; >>> +??? static volatile bool????? _destroyed; >>> >>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>> --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 15:40:20 >>> 2017 +0200 >>> +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 12:15:30 >>> 2017 +0900 >>> @@ -578,7 +578,7 @@ >>> ?????? static_field(PerfMemory, _top, >>> char*)???????????????????????????????? \ >>> ?????? static_field(PerfMemory, _capacity, >>> size_t)??????????????????????????????? \ >>> ?????? static_field(PerfMemory, _prologue, >>> PerfDataPrologue*)???????????????????? \ >>> -???? static_field(PerfMemory, _initialized, >>> jint)????????????????????????????????? \ >>> +???? static_field(PerfMemory,???????? volatile _initialized, >>> jint)????????????????????????????????? 
\ >>> --------------- >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2017/10/19 6:18, serguei.spitsyn at oracle.com wrote: >>>> On 10/18/17 06:51, Yasumasa Suenaga wrote: >>>>> Hi David, Serguei, >>>>> >>>>>> because as soon as we have checked is_usable() and abort happening >>>>>> in another thread may have changed that by calling destroy. >>>>>> >>>>>> This code is basically broken if we hit an abort path instead of a >>>>>> normal VM shutdown. >>>>> >>>>> Can we use MutexLocker for initialize() and destroy() ? >>>>> >>>>> >>>>> I've tried to fix about your comments, but I have an issue about >>>>> volatile. >>>>> PerfMemory.java depends on PerfMemory::_initialized. However >>>>> VMStructs cannot handle static volatile variables. >>>>> I think two approaches as below: >>>>> >>>>> >>>>> ? 1. Remove _initialized check from PerfMemory.java >>>>> ???? SA will throw UnmappedAddressException if JSnap try to access >>>>> invalid address including uninitialized memory. >>>>> >>>>> ? 2. Add static volatile support to VMStructs >>>>> >>>>> >>>>> Which should we do? >>>>> 1. is easy to fix. But 2. might be right way... >>>> >>>> Would the below work? : >>>> >>>> ??578????? static_field(PerfMemory, _initialized, volatile >>>> jint)????????????????????????????????? \ >>>> >>>> It'd be similar to this non-static case: >>>> ??362?? nonstatic_field(ConstantPoolCacheEntry, >>>> _f1,????????????????????????????????? volatile >>>> Metadata*)??????????????????? \ >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2017/10/18 21:34, David Holmes wrote: >>>>>> Just to clarify ... >>>>>> >>>>>> On 18/10/2017 10:28 PM, David Holmes wrote: >>>>>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> Thank you for jumping to this review and helping Yasumasa to >>>>>>>> sort it out! 
>>>>>>>> I've just discovered that this issue was already on the table >>>>>>>> for several months without a significant progress. >>>>>>>> >>>>>>>> >>>>>>>> On 10/18/17 02:48, David Holmes wrote: >>>>>>>>> Hi Serguei >>>>>>>>> >>>>>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> Sorry for a quite late participation. >>>>>>>>>> >>>>>>>>>> I looked at the previous webrevs and think that this one is >>>>>>>>>> much better. >>>>>>>>>> >>>>>>>>>> Some concern is if we need any kind of synchronization here, >>>>>>>>>> e.g. CAS. >>>>>>>>>> But it depends on the PerfMemory class usage. >>>>>>>>>> >>>>>>>>>> Should we make the static variables '_initialized' and >>>>>>>>>> '_destroyed' volatile? >>>>>>>>> >>>>>>>>> For good measure - yes. >>>>>>>>> >>>>>>>>>> Also, the '_initialized' is set to 1 with: >>>>>>>>>> ??? 159 OrderAccess::release_store(&_initialized, 1); >>>>>>>>>> >>>>>>>>>> Should we do the same to set the '_destroyed'?: >>>>>>>>>> 200 _destroyed = true; >>>>>>>>> >>>>>>>>> There is a benign initialization race but we need the >>>>>>>>> release_store to ensure all the data fields can be read if >>>>>>>>> _initialized is seen as true. But what is missing is a >>>>>>>>> load_acquire() in is_initialized() to ensure we synchronize >>>>>>>>> with that store! >>>>>>>> >>>>>>>> Yes, I noticed that the load_acquire() is missed. :| >>>>>>>> >>>>>>>>> >>>>>>>>> There is also a potential for a destruction race (if multiple >>>>>>>>> aborts happens concurrently in different threads) but that also >>>>>>>>> seems benign. In this case there is no data being set so the >>>>>>>>> store to _destroyed does not need to be a release_store. >>>>>>>> >>>>>>>> I'm not convinced yet this is benign as the >>>>>>>> PerfMemory::destroy() has this call: >>>>>>>> ?? 197 delete_memory_region(); >>>>>>> >>>>>>> Yes though most of its work ends up being no-ops. 
>>>>>>> >>>>>>>> >>>>>>>> Now, I started thinking about the asserts that call the >>>>>>>> is_useable(). >>>>>>>> Should they be returns instead? >>>>>>> >>>>>>> I think this is a somewhat confused chunk of code. It's only >>>>>>> fractionally thread-safe yet once in use could be in use >>>>>>> concurrently with an aborting thread that calls destroy(). I >>>>>>> don't think there is any simple fix for this. If we're in the >>>>>>> process of crashing does it really matter if we trigger a >>>>>>> secondary crash due to this? >>>>>> >>>>>> It doesn't matter if we do: >>>>>> >>>>>> assert(is_usable(),...); >>>>>> // continue >>>>>> >>>>>> or >>>>>> >>>>>> if (!is_usable()) return; >>>>>> // continue >>>>>> >>>>>> because as soon as we have checked is_usable() and abort happening >>>>>> in another thread may have changed that by calling destroy. >>>>>> >>>>>> This code is basically broken if we hit an abort path instead of a >>>>>> normal VM shutdown. >>>>>> >>>>>> David >>>>>> ----- >>>>>> >>>>>>> The problems with this code go way beyond what Yasumasa is trying >>>>>>> to address with the JSnap problem and I would not want to put it >>>>>>> back on him to try and come up with an overall solution. >>>>>>> >>>>>>>> Then the is_destroyed() would better to have the load_acquire(). >>>>>>> >>>>>>> You could add a load_acquire and do the store_release. It >>>>>>> certainly would not hurt, but I don't think it would actually >>>>>>> benefit anything either. >>>>>>> >>>>>>> Cheers, >>>>>>> David >>>>>>> >>>>>>>> Just interested to know what do you think on this. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> Thank you for your comment. 
>>>>>>>>>>> I uploaded new webrev:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/
>>>>>>>>>>>
>>>>>>>>>>> Serguei, please comment about this :-)
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>> 2017-10-18 16:09 GMT+09:00 David Holmes:
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones can't
>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is called.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've added PerfMemory::is_useable() to check whether we can access to
>>>>>>>>>>>>> PerfMemory.
>>>>>>>>>>>>> I think this webrev prevent to access to PerfMemory after destroy() call.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/
>>>>>>>>>>>>
>>>>>>>>>>>> This:
>>>>>>>>>>>>
>>>>>>>>>>>>    90 void PerfMemory::initialize() {
>>>>>>>>>>>>    91
>>>>>>>>>>>>    92   if (_prologue != NULL)
>>>>>>>>>>>>    93     // initialization already performed
>>>>>>>>>>>>    94     return;
>>>>>>>>>>>>
>>>>>>>>>>>> shouldn't check _prologue, but is_initialized().
>>>>>>>>>>>>
>>>>>>>>>>>>   213   assert(is_useable(), "called before initialization");
>>>>>>>>>>>>
>>>>>>>>>>>> -> "called before init or after destroy"
>>>>>>>>>>>>
>>>>>>>>>>>> Could add a similar assert in PerfMemory::mark_updated().
>>>>>>>>>>>>
>>>>>>>>>>>> Let's see what Serguei thinks.
:) >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 2017-10-18 13:44 GMT+09:00 David >>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David >>>>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so >>>>>>>>>>>>>>>>>> the assertion >>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted >>>>>>>>>>>>>>>>>> memory region! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not >>>>>>>>>>>>>>>>> call munmap() >>>>>>>>>>>>>>>>> for PerfMemory. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Perhaps not but there are still other actions that >>>>>>>>>>>>>>>> happen and the point >>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>> we should not be able to continue to use PerfMemory once >>>>>>>>>>>>>>>> it has been >>>>>>>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I received same comment from Dmitry in the past, but we >>>>>>>>>>>>>>> couldn't >>>>>>>>>>>>>>> decide how should we do. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In that discussion, I uploaded another webrev which adds >>>>>>>>>>>>>>> other fields >>>>>>>>>>>>>>> for >>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>> Is it suitable? 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones can't
>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is called.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during
>>>>>>>>>>>>>>>>>> initialization?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values.
>>>>>>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm not familiar with these tools. When do we produce a core file after
>>>>>>>>>>>>>>>> calling PerfMemory::destroy ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> PerfMemory::destroy() is called before aborting.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ah - right. I assume we need to close off the perfdata file before we
>>>>>>>>>>>>>> abort.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -----------------------
>>>>>>>>>>>>>>> #0  perfMemory_exit ()
>>>>>>>>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80
>>>>>>>>>>>>>>> #1  0x00007f99b091c949 in os::shutdown ()
>>>>>>>>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483
>>>>>>>>>>>>>>> #2  0x00007f99b091c980 in os::abort (dump_core=<optimized out>)
>>>>>>>>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503
>>>>>>>>>>>>>>> #3  0x00007f99b0b689c3 in VMError::report_and_die (
>>>>>>>>>>>>>>>     this=this@entry=0x7ffcacf40b50)
>>>>>>>>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
>>>>>>>>>>>>>>> #4  0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig@entry=11,
>>>>>>>>>>>>>>>     info=info@entry=0x7ffcacf40df0,
>>>>>>>>>>>>>>>     ucVoid=ucVoid@entry=0x7ffcacf40cc0,
>>>>>>>>>>>>>>>     abort_if_unrecognized=abort_if_unrecognized@entry=1)
>>>>>>>>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
>>>>>>>>>>>>>>> -----------------------
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> But it seems to me that there are various checks of
>>>>>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or
>>>>>>>>>>>>>>>>>> is_destroyed() as a guard.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Should I change all assertions for _prologue?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Assertions and direct guards. Checking _prologue is a placeholder for the
>>>>>>>>>>>>>>>> real check.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We hit the assertion:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> #  Internal Error (/open/src/hotspot/share/runtime/perfMemory.cpp:216),
>>>>>>>>>>>>>>>>>> pid=17874, tid=17875
>>>>>>>>>>>>>>>>>> #  assert(_prologue != __null) failed: called before initialization
>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> which is misleading because it can fail if called before initialization,
>>>>>>>>>>>>>>>>>> or after PerfMemory::destroy has been called.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion would
>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during
>>>>>>>>>>>>>>>>>> initialization? But it seems to me that there are various checks of
>>>>>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or
>>>>>>>>>>>>>>>>>> is_destroyed() as a guard.
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I added gtest unit test case for this change in new >>>>>>>>>>>>>>>>>>>> webrev: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa >>>>>>>>>>>>>>>>>>>> Suenaga: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not >>>>>>>>>>>>>>>>>>>>>>>> parse core image >>>>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse >>>>>>>>>>>>>>>>>>>>>>>> coredump and I've sent >>>>>>>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has >>>>>>>>>>>>>>>>>>>>>>>> not been reviewed >>>>>>>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but >>>>>>>>>>>>>>>>>>>>>>>> we could not >>>>>>>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java >>>>>>>>>>>>>>>>>>>>>>>> processes and core >>>>>>>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this >>>>>>>>>>>>>>>>>>>>>>>> issue. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. 
I think >>>>>>>>>>>>>>>>>>>>>>>> this patch is >>>>>>>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards >>>>>>>>>>>>>>>>>>>>>>>> double free, and >>>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are >>>>>>>>>>>>>>>>>>>>>>>> not munmap'ed) >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>> > From jini.george at oracle.com Thu Oct 19 10:57:49 2017 From: jini.george at oracle.com (Jini George) Date: Thu, 19 Oct 2017 16:27:49 +0530 Subject: RFR: 8189373: jmap -heap exited with error code In-Reply-To: <5f387be5-bd8a-91c6-39d4-871a74e80355@redhat.com> References: <5f387be5-bd8a-91c6-39d4-871a74e80355@redhat.com> Message-ID: Your changes look good to me, Roman. Nit: Please do change the copyright year to 2017 in CollectedHeapName.java. Thank you, Jini (not a Reviewer). On 10/18/2017 11:59 PM, Roman Kennke wrote: > My recent CMSHeap extraction has broken the JVM servicability agent. It > looks like I actually need a little bit more boilerplate to make it happy: > > http://cr.openjdk.java.net/~rkennke/8189373/webrev.00/ > > > It does fix the test that's mentioned in the bug report: > > https://bugs.openjdk.java.net/browse/JDK-8189373 > > Is this the correct way to fix it? 
>
> Roman
>

From david.holmes at oracle.com Thu Oct 19 11:00:55 2017
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 19 Oct 2017 21:00:55 +1000
Subject: RFR: 8189373: jmap -heap exited with error code
In-Reply-To:
References: <5f387be5-bd8a-91c6-39d4-871a74e80355@redhat.com>
Message-ID: <6b12f58a-35eb-d7a1-1684-64ed6b73708b@oracle.com>

I also took a look and it seemed okay - Jini is the expert so that's
good enough for me. :)

Reviewed.

David

On 19/10/2017 8:57 PM, Jini George wrote:
> Your changes look good to me, Roman.
>
> Nit: Please do change the copyright year to 2017 in CollectedHeapName.java.
>
> Thank you,
> Jini (not a Reviewer).
>
> On 10/18/2017 11:59 PM, Roman Kennke wrote:
>> My recent CMSHeap extraction has broken the JVM servicability agent.
>> It looks like I actually need a little bit more boilerplate to make it
>> happy:
>>
>> http://cr.openjdk.java.net/~rkennke/8189373/webrev.00/
>>
>> It does fix the test that's mentioned in the bug report:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8189373
>>
>> Is this the correct way to fix it?
>>
>> Roman
>>

From rkennke at redhat.com Thu Oct 19 11:28:39 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 19 Oct 2017 13:28:39 +0200
Subject: RFR: 8189373: jmap -heap exited with error code
In-Reply-To: <6b12f58a-35eb-d7a1-1684-64ed6b73708b@oracle.com>
References: <5f387be5-bd8a-91c6-39d4-871a74e80355@redhat.com>
 <6b12f58a-35eb-d7a1-1684-64ed6b73708b@oracle.com>
Message-ID: <17635b8d-7071-acbe-469e-688233862033@redhat.com>

Hi David, hi Jini,

thank you both for review!

Now I need a sponsor.

Final patch (incl. summary and reviewed-by):

http://cr.openjdk.java.net/~rkennke/8189373/webrev.01/

Thank you!
Roman

> I also took a look and it seemed okay - Jini is the expert so that's
> good enough for me. :)
>
> Reviewed.
>
> David
>
> On 19/10/2017 8:57 PM, Jini George wrote:
>> Your changes look good to me, Roman.
>>
>> Nit: Please do change the copyright year to 2017 in
>> CollectedHeapName.java.
>>
>> Thank you,
>> Jini (not a Reviewer).
>>
>> On 10/18/2017 11:59 PM, Roman Kennke wrote:
>>> My recent CMSHeap extraction has broken the JVM servicability agent.
>>> It looks like I actually need a little bit more boilerplate to make
>>> it happy:
>>>
>>> http://cr.openjdk.java.net/~rkennke/8189373/webrev.00/
>>>
>>> It does fix the test that's mentioned in the bug report:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8189373
>>>
>>> Is this the correct way to fix it?
>>>
>>> Roman
>>>

From yasuenag at gmail.com Thu Oct 19 11:44:30 2017
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Thu, 19 Oct 2017 20:44:30 +0900
Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To: <05720ca7-73ef-2761-e8f5-1d0fc45c8bdc@oracle.com>
References: <643d9ea2-5bb3-f6ef-8007-14cd1c580137@oracle.com>
 <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com>
 <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com>
 <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com>
 <41185116-0271-374c-da9e-74e173d6b58c@oracle.com>
 <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com>
 <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com>
 <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com>
 <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com>
 <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com>
 <8e2c7e3f-2f3a-7168-712b-2642a4e88239@oracle.com>
 <05720ca7-73ef-2761-e8f5-1d0fc45c8bdc@oracle.com>
Message-ID: <6bffc41e-80e6-6863-f46a-fba7ba66dc2a@gmail.com>

Hi,

> I suggest we leave the volatile off for now and file a RFE to add volatile_static_field support to VMStructs and update later.

Okay. David or Serguei, could you file it?

>> I'd suggest to fall back to your previous approach as synchronization was not there
>> in the first place, and it is not a part of the original issue you are trying to fix
>> (if David or anyone else does not a simple solution).
> I don't think trying to introduce locking would be a good idea as it would likely lead to deadlocks when a crash occurs. This could also be investigated as a future RFE if desired. Sorry, I have mistake the spell of "usable". I've fixed it in new webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.08/ Can I list David and Serguei as Reviewer? I will send a changeset to Serguei if it can. Thanks, Yasumasa On 2017/10/19 19:41, David Holmes wrote: > Hi Serguei, Yasumasa, > > I suggest we leave the volatile off for now and file a RFE to add volatile_static_field support to VMStructs and update later. > > I don't think trying to introduce locking would be a good idea as it would likely lead to deadlocks when a crash occurs. This could also be investigated as a future RFE if desired. > > Thanks, > David > > On 19/10/2017 7:37 PM, serguei.spitsyn at oracle.com wrote: >> Hi Yasumasa, >> >> I see the problem. >> As it occurred making these variables volatile is non-trivial. >> But thank you a lot for trying! >> >> I'd suggest to fall back to your previous approach as synchronization was not there >> in the first place, and it is not a part of the original issue you are trying to fix >> (if David or anyone else does not a simple solution). >> But let's check if David does not object against it. >> >> I will sponsor your fix after you send me a patch. >> >> Thanks, >> Serguei >> >> >> On 10/18/17 20:21, Yasumasa Suenaga wrote: >>> Sorry, I have mistake. >>> But I cannot compile yet: >>> >>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>> --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 15:40:20 2017 +0200 >>> +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 12:21:11 2017 +0900 >>> @@ -578,7 +578,7 @@ >>> ????? static_field(PerfMemory, _top, char*)???????????????????????????????? \ >>> ????? static_field(PerfMemory, _capacity, size_t)??????????????????????????????? \ >>> ????? 
static_field(PerfMemory, _prologue, PerfDataPrologue*)???????????????????? \ >>> -???? static_field(PerfMemory, _initialized, jint)????????????????????????????????? \ >>> +???? static_field(PerfMemory, _initialized,????????????????????????????????? volatile jint)??????????????????????????? \ >>> >>> -------------- >>> In file included from /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:58: error: invalid conversion from 'volatile void*' to 'void*' [-fpermissive] >>> ? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>> ????? static_field(PerfMemory, _initialized,????????????????????????????????? volatile jint)??????????????????????????? \ >>> ????? ^~~~~~~~~~~~ >>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>> ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>> ?? ^ >>> gmake[3]: *** [lib/CompileJvm.gmk:210: /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] Error 1 >>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>> >>> ERROR: Build failed for target 'images' in configuration 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>> -------------- >>> >>> >>> >>> On 2017/10/19 12:18, Yasumasa Suenaga wrote: >>>> Hi Serguei, >>>> >>>>> Would the below work? : >>>>> >>>>> ? 578????? static_field(PerfMemory, _initialized, volatile jint)????????????????????????????????? \ >>>>> >>>>> It'd be similar to this non-static case: >>>>> ? 362?? nonstatic_field(ConstantPoolCacheEntry, _f1,????????????????????????????????? volatile Metadata*)??????????????????? 
\ >>>> >>>> I got error messages as below: >>>> >>>> --------------- >>>> In file included from /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected unqualified-id before 'volatile' >>>> ?????? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? \ >>>> ??????????????????????????????????????? ^ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>> ^~~~~~~~~ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>> ??? ^ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected '}' before 'volatile' >>>> ?????? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? \ >>>> ??????????????????????????????????????? ^ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>> ^~~~~~~~~ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>> ??? ^ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected '}' before 'volatile' >>>> ?????? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? \ >>>> ??????????????????????????????????????? 
^ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>> ^~~~~~~~~ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>> ??? ^ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:79: error: expected declaration before '}' token >>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>> ^ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>> ?????? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? \ >>>> ?????? ^~~~~~~~~~~~ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>> ??? ^ >>>> gmake[3]: *** [lib/CompileJvm.gmk:210: /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] Error 1 >>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>> >>>> ERROR: Build failed for target 'images' in configuration 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>> --------------- >>>> >>>> >>>> I changed as below: >>>> --------------- >>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/perfMemory.cpp >>>> --- a/src/hotspot/share/runtime/perfMemory.cpp? Thu Sep 07 15:40:20 2017 +0200 >>>> +++ b/src/hotspot/share/runtime/perfMemory.cpp? Thu Oct 19 12:15:30 2017 +0900 >>>> @@ -51,8 +51,9 @@ >>>> ??char*??????????????????? PerfMemory::_end = NULL; >>>> ??char*??????????????????? PerfMemory::_top = NULL; >>>> ??size_t?????????????????? 
PerfMemory::_capacity = 0; >>>> -jint???????????????????? PerfMemory::_initialized = false; >>>> +volatile jint??????????? PerfMemory::_initialized = 0; >>>> ??PerfDataPrologue*??????? PerfMemory::_prologue = NULL; >>>> +volatile bool??????????? PerfMemory::_destroyed = false; >>>> >>>> --- a/src/hotspot/share/runtime/perfMemory.hpp? Thu Sep 07 15:40:20 2017 +0200 >>>> +++ b/src/hotspot/share/runtime/perfMemory.hpp? Thu Oct 19 12:15:30 2017 +0900 >>>> @@ -113,13 +113,15 @@ >>>> ?? */ >>>> ??class PerfMemory : AllStatic { >>>> ????? friend class VMStructs; >>>> +??? friend class PerfMemoryTest; >>>> ??? private: >>>> ????? static char*? _start; >>>> ????? static char*? _end; >>>> ????? static char*? _top; >>>> ????? static size_t _capacity; >>>> ????? static PerfDataPrologue*? _prologue; >>>> -??? static jint?? _initialized; >>>> +??? static volatile jint????? _initialized; >>>> +??? static volatile bool????? _destroyed; >>>> >>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>>> --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 15:40:20 2017 +0200 >>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 12:15:30 2017 +0900 >>>> @@ -578,7 +578,7 @@ >>>> ?????? static_field(PerfMemory, _top, char*)???????????????????????????????? \ >>>> ?????? static_field(PerfMemory, _capacity, size_t)??????????????????????????????? \ >>>> ?????? static_field(PerfMemory, _prologue, PerfDataPrologue*)???????????????????? \ >>>> -???? static_field(PerfMemory, _initialized, jint)????????????????????????????????? \ >>>> +???? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? 
\ >>>> --------------- >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2017/10/19 6:18, serguei.spitsyn at oracle.com wrote: >>>>> On 10/18/17 06:51, Yasumasa Suenaga wrote: >>>>>> Hi David, Serguei, >>>>>> >>>>>>> because as soon as we have checked is_usable() and abort happening in another thread may have changed that by calling destroy. >>>>>>> >>>>>>> This code is basically broken if we hit an abort path instead of a normal VM shutdown. >>>>>> >>>>>> Can we use MutexLocker for initialize() and destroy() ? >>>>>> >>>>>> >>>>>> I've tried to fix about your comments, but I have an issue about volatile. >>>>>> PerfMemory.java depends on PerfMemory::_initialized. However VMStructs cannot handle static volatile variables. >>>>>> I think two approaches as below: >>>>>> >>>>>> >>>>>> ? 1. Remove _initialized check from PerfMemory.java >>>>>> ???? SA will throw UnmappedAddressException if JSnap try to access invalid address including uninitialized memory. >>>>>> >>>>>> ? 2. Add static volatile support to VMStructs >>>>>> >>>>>> >>>>>> Which should we do? >>>>>> 1. is easy to fix. But 2. might be right way... >>>>> >>>>> Would the below work? : >>>>> >>>>> ??578????? static_field(PerfMemory, _initialized, volatile jint)????????????????????????????????? \ >>>>> >>>>> It'd be similar to this non-static case: >>>>> ??362?? nonstatic_field(ConstantPoolCacheEntry, _f1,????????????????????????????????? volatile Metadata*)??????????????????? \ >>>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2017/10/18 21:34, David Holmes wrote: >>>>>>> Just to clarify ... >>>>>>> >>>>>>> On 18/10/2017 10:28 PM, David Holmes wrote: >>>>>>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> Thank you for jumping to this review and helping Yasumasa to sort it out! 
>>>>>>>>> I've just discovered that this issue was already on the table for several months without a significant progress. >>>>>>>>> >>>>>>>>> >>>>>>>>> On 10/18/17 02:48, David Holmes wrote: >>>>>>>>>> Hi Serguei >>>>>>>>>> >>>>>>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>> >>>>>>>>>>> Sorry for a quite late participation. >>>>>>>>>>> >>>>>>>>>>> I looked at the previous webrevs and think that this one is much better. >>>>>>>>>>> >>>>>>>>>>> Some concern is if we need any kind of synchronization here, e.g. CAS. >>>>>>>>>>> But it depends on the PerfMemory class usage. >>>>>>>>>>> >>>>>>>>>>> Should we make the static variables '_initialized' and '_destroyed' volatile? >>>>>>>>>> >>>>>>>>>> For good measure - yes. >>>>>>>>>> >>>>>>>>>>> Also, the '_initialized' is set to 1 with: >>>>>>>>>>> ??? 159 OrderAccess::release_store(&_initialized, 1); >>>>>>>>>>> >>>>>>>>>>> Should we do the same to set the '_destroyed'?: >>>>>>>>>>> 200 _destroyed = true; >>>>>>>>>> >>>>>>>>>> There is a benign initialization race but we need the release_store to ensure all the data fields can be read if _initialized is seen as true. But what is missing is a load_acquire() in is_initialized() to ensure we synchronize with that store! >>>>>>>>> >>>>>>>>> Yes, I noticed that the load_acquire() is missed. :| >>>>>>>>> >>>>>>>>>> >>>>>>>>>> There is also a potential for a destruction race (if multiple aborts happens concurrently in different threads) but that also seems benign. In this case there is no data being set so the store to _destroyed does not need to be a release_store. >>>>>>>>> >>>>>>>>> I'm not convinced yet this is benign as the PerfMemory::destroy() has this call: >>>>>>>>> ?? 197 delete_memory_region(); >>>>>>>> >>>>>>>> Yes though most of its work ends up being no-ops. >>>>>>>> >>>>>>>>> >>>>>>>>> Now, I started thinking about the asserts that call the is_useable(). >>>>>>>>> Should they be returns instead? 
>>>>>>>> >>>>>>>> I think this is a somewhat confused chunk of code. It's only fractionally thread-safe yet once in use could be in use concurrently with an aborting thread that calls destroy(). I don't think there is any simple fix for this. If we're in the process of crashing does it really matter if we trigger a secondary crash due to this? >>>>>>> >>>>>>> It doesn't matter if we do: >>>>>>> >>>>>>> assert(is_usable(),...); >>>>>>> // continue >>>>>>> >>>>>>> or >>>>>>> >>>>>>> if (!is_usable()) return; >>>>>>> // continue >>>>>>> >>>>>>> because as soon as we have checked is_usable() and abort happening in another thread may have changed that by calling destroy. >>>>>>> >>>>>>> This code is basically broken if we hit an abort path instead of a normal VM shutdown. >>>>>>> >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> The problems with this code go way beyond what Yasumasa is trying to address with the JSnap problem and I would not want to put it back on him to try and come up with an overall solution. >>>>>>>> >>>>>>>>> Then the is_destroyed() would better to have the load_acquire(). >>>>>>>> >>>>>>>> You could add a load_acquire and do the store_release. It certainly would not hurt, but I don't think it would actually benefit anything either. >>>>>>>> >>>>>>>> Cheers, >>>>>>>> David >>>>>>>> >>>>>>>>> Just interested to know what do you think on this. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for your comment. 
>>>>>>>>>>>> I uploaded new webrev: >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>>>>>>>>> >>>>>>>>>>>> Serguei, please comment about this :-) >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 2017-10-18 16:09 GMT+09:00 David Holmes: >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>>>>>>>> can't >>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I've added PerfMemory::is_useable() to check whether we can access to >>>>>>>>>>>>>> PerfMemory. >>>>>>>>>>>>>> I think this webrev prevent to access to PerfMemory after destroy() call. >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>>>>>>>> >>>>>>>>>>>>> This: >>>>>>>>>>>>> >>>>>>>>>>>>> ?? 90 void PerfMemory::initialize() { >>>>>>>>>>>>> ?? 91 >>>>>>>>>>>>> ?? 92?? if (_prologue != NULL) >>>>>>>>>>>>> ?? 93???? // initialization already performed >>>>>>>>>>>>> ?? 94???? return; >>>>>>>>>>>>> >>>>>>>>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>>>>>>>> >>>>>>>>>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>>>>>>>>> >>>>>>>>>>>>> -> "called before init or after destroy" >>>>>>>>>>>>> >>>>>>>>>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>>>>>>>>> >>>>>>>>>>>>> Let's see what Serguei thinks. 
:) >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call munmap() >>>>>>>>>>>>>>>>>> for PerfMemory. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Perhaps not but there are still other actions that happen and the point >>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>> we should not be able to continue to use PerfMemory once it has been >>>>>>>>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>>>>>>>>>>>> decide how should we do. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> In that discussion, I uploaded another webrev which adds other fields >>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>> Is it suitable? 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>>>>>>>> can't >>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>>>>>>>> initialization? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>>>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I'm not familiar with these tools. When do we produce a core file after >>>>>>>>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Ah - right. I assume we need to close off the perfdata file before we >>>>>>>>>>>>>>> abort. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>>>> #0? perfMemory_exit () >>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>>>>>>>> #2? 0x00007f99b091c980 in os::abort (dump_core=) >>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>>>>>>>> #3? 
0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>>>>>>>> ?????? this=this at entry=0x7ffcacf40b50) >>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>>>>>>>> #4? 0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig at entry=11, >>>>>>>>>>>>>>>> ?????? info=info at entry=0x7ffcacf40df0, >>>>>>>>>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Assertions and direct guards. Checking _prologue is a placeholder for >>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>> real check. 
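A minimal sketch of the guard being asked for here, assuming a separate destroyed flag and a combined is_usable() check instead of testing _prologue. PerfStateSketch and its members are hypothetical names for illustration; the real PerfMemory code differs.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for PerfMemory; all names are illustrative only.
struct PerfStateSketch {
  static std::atomic<int>  initialized;
  static std::atomic<bool> destroyed;
  static char*             prologue;  // left intact by destroy() so a core dump still shows it

  static bool is_initialized() { return initialized.load(std::memory_order_acquire) != 0; }
  static bool is_destroyed()   { return destroyed.load(std::memory_order_acquire); }
  // The "real check": usable only between initialize() and destroy().
  static bool is_usable()      { return is_initialized() && !is_destroyed(); }

  static void initialize(char* p) {
    if (is_initialized()) return;  // initialization already performed
    prologue = p;
    initialized.store(1, std::memory_order_release);
  }

  static void destroy() {
    if (!is_usable()) return;      // guards against a second destroy
    // Logical destruction only: the fields are not cleared.
    destroyed.store(true, std::memory_order_release);
  }
};

std::atomic<int>  PerfStateSketch::initialized{0};
std::atomic<bool> PerfStateSketch::destroyed{false};
char*             PerfStateSketch::prologue = nullptr;
```

Note that a caller's is_usable() result can already be stale by the time it acts on it if another thread calls destroy() concurrently, so this guard narrows the window but does not close it.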
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> #? Internal Error >>>>>>>>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>>>>>>>> #? assert(_prologue != __null) failed: called before initialization >>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>>>>>>>>> initialization, >>>>>>>>>>>>>>>>>>> or >>>>>>>>>>>>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>>>>>>>> initialization? But it seems to me that there are various checks of >>>>>>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse core image >>>>>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>>>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not been reviewed >>>>>>>>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we could not >>>>>>>>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes and core >>>>>>>>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. 
I think this patch is >>>>>>>>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double free, and >>>>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not munmap'ed) >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>> >> From david.holmes at oracle.com Thu Oct 19 12:24:57 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 19 Oct 2017 22:24:57 +1000 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. 
In-Reply-To: <6bffc41e-80e6-6863-f46a-fba7ba66dc2a@gmail.com>
References: <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com> <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com> <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com> <8e2c7e3f-2f3a-7168-712b-2642a4e88239@oracle.com> <05720ca7-73ef-2761-e8f5-1d0fc45c8bdc@oracle.com> <6bffc41e-80e6-6863-f46a-fba7ba66dc2a@gmail.com>
Message-ID:

On 19/10/2017 9:44 PM, Yasumasa Suenaga wrote:
> Hi,
>
>> I suggest we leave the volatile off for now and file a RFE to add
>> volatile_static_field support to VMStructs and update later.
>
> Okay. David or Serguei, could you file it?
>
>
>>> I'd suggest to fall back to your previous approach as synchronization
>>> was not there
>>> in the first place, and it is not a part of the original issue you
>>> are trying to fix
>>> (if David or anyone else does not a simple solution).
>
>> I don't think trying to introduce locking would be a good idea as it
>> would likely lead to deadlocks when a crash occurs. This could also be
>> investigated as a future RFE if desired.
>
> Sorry, I have mistake the spell of "usable".
> I've fixed it in new webrev:
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.08/
>
> Can I list David and Serguei as Reviewer?
> I will send a changeset to Serguei if it can.

Yes.

Thanks,
David

>
> Thanks,
>
> Yasumasa
>
>
> On 2017/10/19 19:41, David Holmes wrote:
>> Hi Serguei, Yasumasa,
>>
>> I suggest we leave the volatile off for now and file a RFE to add
>> volatile_static_field support to VMStructs and update later.
>>
>> I don't think trying to introduce locking would be a good idea as it
>> would likely lead to deadlocks when a crash occurs. This could also be
>> investigated as a future RFE if desired.
>>
>> Thanks,
>> David
>>
>> On 19/10/2017 7:37 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Yasumasa,
>>>
>>> I see the problem.
>>> As it occurred making these variables volatile is non-trivial.
>>> But thank you a lot for trying!
>>>
>>> I'd suggest to fall back to your previous approach as synchronization
>>> was not there
>>> in the first place, and it is not a part of the original issue you
>>> are trying to fix
>>> (if David or anyone else does not a simple solution).
>>> But let's check if David does not object against it.
>>>
>>> I will sponsor your fix after you send me a patch.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 10/18/17 20:21, Yasumasa Suenaga wrote:
>>>> Sorry, I have mistake.
>>>> But I cannot compile yet:
>>>>
>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp
>>>> --- a/src/hotspot/share/runtime/vmStructs.cpp   Thu Sep 07 15:40:20
>>>> 2017 +0200
>>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp   Thu Oct 19 12:21:11
>>>> 2017 +0900
>>>> @@ -578,7 +578,7 @@
>>>>       static_field(PerfMemory, _top,
>>>> char*)                                 \
>>>>       static_field(PerfMemory, _capacity,
>>>> size_t)                                \
>>>>       static_field(PerfMemory, _prologue,
>>>> PerfDataPrologue*)                     \
>>>> -     static_field(PerfMemory, _initialized,
>>>> jint)                                  \
>>>> +     static_field(PerfMemory,
>>>> _initialized,                                  volatile
>>>> jint)                            \
>>>>
>>>> --------------
>>>> In file included from
>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0:
>>>>
>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:58:
>>>> error: invalid conversion from 'volatile void*' to 'void*'
>>>> [-fpermissive]
>>>>
{ QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>> &typeName::fieldName }, >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: >>>> note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>> ????? static_field(PerfMemory, >>>> _initialized,????????????????????????????????? volatile >>>> jint)??????????????????????????? \ >>>> ????? ^~~~~~~~~~~~ >>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>> note: in expansion of macro 'VM_STRUCTS' >>>> ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>> ?? ^ >>>> gmake[3]: *** [lib/CompileJvm.gmk:210: >>>> /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] >>>> Error 1 >>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>> >>>> ERROR: Build failed for target 'images' in configuration >>>> 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>> -------------- >>>> >>>> >>>> >>>> On 2017/10/19 12:18, Yasumasa Suenaga wrote: >>>>> Hi Serguei, >>>>> >>>>>> Would the below work? : >>>>>> >>>>>> ? 578????? static_field(PerfMemory, _initialized, volatile >>>>>> jint)????????????????????????????????? \ >>>>>> >>>>>> It'd be similar to this non-static case: >>>>>> ? 362?? nonstatic_field(ConstantPoolCacheEntry, >>>>>> _f1,????????????????????????????????? volatile >>>>>> Metadata*)??????????????????? \ >>>>> >>>>> I got error messages as below: >>>>> >>>>> --------------- >>>>> In file included from >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>>>> >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>>>> error: expected unqualified-id before 'volatile' >>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>> jint)????????????????????????????????? \ >>>>> ??????????????????????????????????????? 
^ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>>>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>> &typeName::fieldName }, >>>>> ^~~~~~~~~ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>> note: in expansion of macro 'VM_STRUCTS' >>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>> ??? ^ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>>>> error: expected '}' before 'volatile' >>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>> jint)????????????????????????????????? \ >>>>> ??????????????????????????????????????? ^ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>>>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>> &typeName::fieldName }, >>>>> ^~~~~~~~~ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>> note: in expansion of macro 'VM_STRUCTS' >>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>> ??? ^ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>>>> error: expected '}' before 'volatile' >>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>> jint)????????????????????????????????? \ >>>>> ??????????????????????????????????????? ^ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>>>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>> &typeName::fieldName }, >>>>> ^~~~~~~~~ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>> note: in expansion of macro 'VM_STRUCTS' >>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>> ??? 
^ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:79: >>>>> error: expected declaration before '}' token >>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>> &typeName::fieldName }, >>>>> ^ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: >>>>> note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>> jint)????????????????????????????????? \ >>>>> ?????? ^~~~~~~~~~~~ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>> note: in expansion of macro 'VM_STRUCTS' >>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>> ??? ^ >>>>> gmake[3]: *** [lib/CompileJvm.gmk:210: >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] >>>>> Error 1 >>>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>>> >>>>> ERROR: Build failed for target 'images' in configuration >>>>> 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>>> --------------- >>>>> >>>>> >>>>> I changed as below: >>>>> --------------- >>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/perfMemory.cpp >>>>> --- a/src/hotspot/share/runtime/perfMemory.cpp? Thu Sep 07 15:40:20 >>>>> 2017 +0200 >>>>> +++ b/src/hotspot/share/runtime/perfMemory.cpp? Thu Oct 19 12:15:30 >>>>> 2017 +0900 >>>>> @@ -51,8 +51,9 @@ >>>>> ??char*??????????????????? PerfMemory::_end = NULL; >>>>> ??char*??????????????????? PerfMemory::_top = NULL; >>>>> ??size_t?????????????????? PerfMemory::_capacity = 0; >>>>> -jint???????????????????? PerfMemory::_initialized = false; >>>>> +volatile jint??????????? PerfMemory::_initialized = 0; >>>>> ??PerfDataPrologue*??????? PerfMemory::_prologue = NULL; >>>>> +volatile bool??????????? PerfMemory::_destroyed = false; >>>>> >>>>> --- a/src/hotspot/share/runtime/perfMemory.hpp? 
Thu Sep 07 15:40:20 >>>>> 2017 +0200 >>>>> +++ b/src/hotspot/share/runtime/perfMemory.hpp? Thu Oct 19 12:15:30 >>>>> 2017 +0900 >>>>> @@ -113,13 +113,15 @@ >>>>> ?? */ >>>>> ??class PerfMemory : AllStatic { >>>>> ????? friend class VMStructs; >>>>> +??? friend class PerfMemoryTest; >>>>> ??? private: >>>>> ????? static char*? _start; >>>>> ????? static char*? _end; >>>>> ????? static char*? _top; >>>>> ????? static size_t _capacity; >>>>> ????? static PerfDataPrologue*? _prologue; >>>>> -??? static jint?? _initialized; >>>>> +??? static volatile jint????? _initialized; >>>>> +??? static volatile bool????? _destroyed; >>>>> >>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>>>> --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 15:40:20 >>>>> 2017 +0200 >>>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 12:15:30 >>>>> 2017 +0900 >>>>> @@ -578,7 +578,7 @@ >>>>> ?????? static_field(PerfMemory, _top, >>>>> char*)???????????????????????????????? \ >>>>> ?????? static_field(PerfMemory, _capacity, >>>>> size_t)??????????????????????????????? \ >>>>> ?????? static_field(PerfMemory, _prologue, >>>>> PerfDataPrologue*)???????????????????? \ >>>>> -???? static_field(PerfMemory, _initialized, >>>>> jint)????????????????????????????????? \ >>>>> +???? static_field(PerfMemory,???????? volatile _initialized, >>>>> jint)????????????????????????????????? \ >>>>> --------------- >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2017/10/19 6:18, serguei.spitsyn at oracle.com wrote: >>>>>> On 10/18/17 06:51, Yasumasa Suenaga wrote: >>>>>>> Hi David, Serguei, >>>>>>> >>>>>>>> because as soon as we have checked is_usable() and abort >>>>>>>> happening in another thread may have changed that by calling >>>>>>>> destroy. >>>>>>>> >>>>>>>> This code is basically broken if we hit an abort path instead of >>>>>>>> a normal VM shutdown. >>>>>>> >>>>>>> Can we use MutexLocker for initialize() and destroy() ? 
>>>>>>> >>>>>>> >>>>>>> I've tried to fix about your comments, but I have an issue about >>>>>>> volatile. >>>>>>> PerfMemory.java depends on PerfMemory::_initialized. However >>>>>>> VMStructs cannot handle static volatile variables. >>>>>>> I think two approaches as below: >>>>>>> >>>>>>> >>>>>>> ? 1. Remove _initialized check from PerfMemory.java >>>>>>> ???? SA will throw UnmappedAddressException if JSnap try to >>>>>>> access invalid address including uninitialized memory. >>>>>>> >>>>>>> ? 2. Add static volatile support to VMStructs >>>>>>> >>>>>>> >>>>>>> Which should we do? >>>>>>> 1. is easy to fix. But 2. might be right way... >>>>>> >>>>>> Would the below work? : >>>>>> >>>>>> ??578????? static_field(PerfMemory, _initialized, volatile >>>>>> jint)????????????????????????????????? \ >>>>>> >>>>>> It'd be similar to this non-static case: >>>>>> ??362?? nonstatic_field(ConstantPoolCacheEntry, >>>>>> _f1,????????????????????????????????? volatile >>>>>> Metadata*)??????????????????? \ >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2017/10/18 21:34, David Holmes wrote: >>>>>>>> Just to clarify ... >>>>>>>> >>>>>>>> On 18/10/2017 10:28 PM, David Holmes wrote: >>>>>>>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> Thank you for jumping to this review and helping Yasumasa to >>>>>>>>>> sort it out! >>>>>>>>>> I've just discovered that this issue was already on the table >>>>>>>>>> for several months without a significant progress. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 10/18/17 02:48, David Holmes wrote: >>>>>>>>>>> Hi Serguei >>>>>>>>>>> >>>>>>>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>> >>>>>>>>>>>> Sorry for a quite late participation. >>>>>>>>>>>> >>>>>>>>>>>> I looked at the previous webrevs and think that this one is >>>>>>>>>>>> much better. 
>>>>>>>>>>>> >>>>>>>>>>>> Some concern is if we need any kind of synchronization here, >>>>>>>>>>>> e.g. CAS. >>>>>>>>>>>> But it depends on the PerfMemory class usage. >>>>>>>>>>>> >>>>>>>>>>>> Should we make the static variables '_initialized' and >>>>>>>>>>>> '_destroyed' volatile? >>>>>>>>>>> >>>>>>>>>>> For good measure - yes. >>>>>>>>>>> >>>>>>>>>>>> Also, the '_initialized' is set to 1 with: >>>>>>>>>>>> ??? 159 OrderAccess::release_store(&_initialized, 1); >>>>>>>>>>>> >>>>>>>>>>>> Should we do the same to set the '_destroyed'?: >>>>>>>>>>>> 200 _destroyed = true; >>>>>>>>>>> >>>>>>>>>>> There is a benign initialization race but we need the >>>>>>>>>>> release_store to ensure all the data fields can be read if >>>>>>>>>>> _initialized is seen as true. But what is missing is a >>>>>>>>>>> load_acquire() in is_initialized() to ensure we synchronize >>>>>>>>>>> with that store! >>>>>>>>>> >>>>>>>>>> Yes, I noticed that the load_acquire() is missed. :| >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> There is also a potential for a destruction race (if multiple >>>>>>>>>>> aborts happens concurrently in different threads) but that >>>>>>>>>>> also seems benign. In this case there is no data being set so >>>>>>>>>>> the store to _destroyed does not need to be a release_store. >>>>>>>>>> >>>>>>>>>> I'm not convinced yet this is benign as the >>>>>>>>>> PerfMemory::destroy() has this call: >>>>>>>>>> ?? 197 delete_memory_region(); >>>>>>>>> >>>>>>>>> Yes though most of its work ends up being no-ops. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Now, I started thinking about the asserts that call the >>>>>>>>>> is_useable(). >>>>>>>>>> Should they be returns instead? >>>>>>>>> >>>>>>>>> I think this is a somewhat confused chunk of code. It's only >>>>>>>>> fractionally thread-safe yet once in use could be in use >>>>>>>>> concurrently with an aborting thread that calls destroy(). I >>>>>>>>> don't think there is any simple fix for this. 
If we're in the >>>>>>>>> process of crashing does it really matter if we trigger a >>>>>>>>> secondary crash due to this? >>>>>>>> >>>>>>>> It doesn't matter if we do: >>>>>>>> >>>>>>>> assert(is_usable(),...); >>>>>>>> // continue >>>>>>>> >>>>>>>> or >>>>>>>> >>>>>>>> if (!is_usable()) return; >>>>>>>> // continue >>>>>>>> >>>>>>>> because as soon as we have checked is_usable() and abort >>>>>>>> happening in another thread may have changed that by calling >>>>>>>> destroy. >>>>>>>> >>>>>>>> This code is basically broken if we hit an abort path instead of >>>>>>>> a normal VM shutdown. >>>>>>>> >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>>> The problems with this code go way beyond what Yasumasa is >>>>>>>>> trying to address with the JSnap problem and I would not want >>>>>>>>> to put it back on him to try and come up with an overall solution. >>>>>>>>> >>>>>>>>>> Then the is_destroyed() would better to have the load_acquire(). >>>>>>>>> >>>>>>>>> You could add a load_acquire and do the store_release. It >>>>>>>>> certainly would not hurt, but I don't think it would actually >>>>>>>>> benefit anything either. >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> Just interested to know what do you think on this. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Cheers, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi David, >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you for your comment. 
>>>>>>>>>>>>> I uploaded new webrev: >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>>>>>>>>>> >>>>>>>>>>>>> Serguei, please comment about this :-) >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 2017-10-18 16:09 GMT+09:00 David >>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure the >>>>>>>>>>>>>>>> existing ones >>>>>>>>>>>>>>>> can't >>>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is >>>>>>>>>>>>>>>> called. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I've added PerfMemory::is_useable() to check whether we >>>>>>>>>>>>>>> can access to >>>>>>>>>>>>>>> PerfMemory. >>>>>>>>>>>>>>> I think this webrev prevent to access to PerfMemory after >>>>>>>>>>>>>>> destroy() call. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> This: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?? 90 void PerfMemory::initialize() { >>>>>>>>>>>>>> ?? 91 >>>>>>>>>>>>>> ?? 92?? if (_prologue != NULL) >>>>>>>>>>>>>> ?? 93???? // initialization already performed >>>>>>>>>>>>>> ?? 94???? return; >>>>>>>>>>>>>> >>>>>>>>>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>>>>>>>>> >>>>>>>>>>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>>>>>>>>>> >>>>>>>>>>>>>> -> "called before init or after destroy" >>>>>>>>>>>>>> >>>>>>>>>>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>>>>>>>>>> >>>>>>>>>>>>>> Let's see what Serguei thinks. 
:) >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2017-10-18 13:44 GMT+09:00 David >>>>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David >>>>>>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue >>>>>>>>>>>>>>>>>>>> so the assertion >>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted >>>>>>>>>>>>>>>>>>>> memory region! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not >>>>>>>>>>>>>>>>>>> call munmap() >>>>>>>>>>>>>>>>>>> for PerfMemory. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Perhaps not but there are still other actions that >>>>>>>>>>>>>>>>>> happen and the point >>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>> we should not be able to continue to use PerfMemory >>>>>>>>>>>>>>>>>> once it has been >>>>>>>>>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I received same comment from Dmitry in the past, but we >>>>>>>>>>>>>>>>> couldn't >>>>>>>>>>>>>>>>> decide how should we do. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In that discussion, I uploaded another webrev which >>>>>>>>>>>>>>>>> adds other fields >>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>> JSnap. 
>>>>>>>>>>>>>>>>> Is it suitable? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure the >>>>>>>>>>>>>>>> existing ones >>>>>>>>>>>>>>>> can't >>>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is >>>>>>>>>>>>>>>> called. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields >>>>>>>>>>>>>>>>>>>> set during >>>>>>>>>>>>>>>>>>>> initialization? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these >>>>>>>>>>>>>>>>>>> field values. >>>>>>>>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I'm not familiar with these tools. When do we produce >>>>>>>>>>>>>>>>>> a core file after >>>>>>>>>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Ah - right. I assume we need to close off the perfdata >>>>>>>>>>>>>>>> file before we >>>>>>>>>>>>>>>> abort. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>>>>> #0? perfMemory_exit () >>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> #2? 
0x00007f99b091c980 in os::abort >>>>>>>>>>>>>>>>> (dump_core=) >>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> #3? 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>>>>>>>>> ?????? this=this at entry=0x7ffcacf40b50) >>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> #4? 0x00007f99b0926f04 in JVM_handle_linux_signal >>>>>>>>>>>>>>>>> (sig=sig at entry=11, >>>>>>>>>>>>>>>>> ?????? info=info at entry=0x7ffcacf40df0, >>>>>>>>>>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>>>>>>>>> _prologue that should really be checking >>>>>>>>>>>>>>>>>>>> is_initialized() and/or >>>>>>>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Assertions and direct guards. Checking _prologue is a >>>>>>>>>>>>>>>>>> placeholder for >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> real check. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David >>>>>>>>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> By chance we ran into this bug which I analysed >>>>>>>>>>>>>>>>>>>> yesterday: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> #? Internal Error >>>>>>>>>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>>>>>>>>> #? assert(_prologue != __null) failed: called before >>>>>>>>>>>>>>>>>>>> initialization >>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> which is misleading because it can fail if called >>>>>>>>>>>>>>>>>>>> before >>>>>>>>>>>>>>>>>>>> initialization, >>>>>>>>>>>>>>>>>>>> or >>>>>>>>>>>>>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue >>>>>>>>>>>>>>>>>>>> so the assertion >>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted >>>>>>>>>>>>>>>>>>>> memory region! >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields >>>>>>>>>>>>>>>>>>>> set during >>>>>>>>>>>>>>>>>>>> initialization? But it seems to me that there are >>>>>>>>>>>>>>>>>>>> various checks of >>>>>>>>>>>>>>>>>>>> _prologue that should really be checking >>>>>>>>>>>>>>>>>>>> is_initialized() and/or >>>>>>>>>>>>>>>>>>>> is_destroyed() as a guard. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I added gtest unit test case for this change in >>>>>>>>>>>>>>>>>>>>>> new webrev: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa >>>>>>>>>>>>>>>>>>>>>> Suenaga: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not >>>>>>>>>>>>>>>>>>>>>>>>>> parse core image >>>>>>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse >>>>>>>>>>>>>>>>>>>>>>>>>> coredump and I've sent >>>>>>>>>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has >>>>>>>>>>>>>>>>>>>>>>>>>> not been reviewed >>>>>>>>>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, >>>>>>>>>>>>>>>>>>>>>>>>>> but we could not >>>>>>>>>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java >>>>>>>>>>>>>>>>>>>>>>>>>> processes and core >>>>>>>>>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this >>>>>>>>>>>>>>>>>>>>>>>>>> issue. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. 
I think >>>>>>>>>>>>>>>>>>>>>>>>>> this patch is >>>>>>>>>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards >>>>>>>>>>>>>>>>>>>>>>>>>> double free, and >>>>>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are >>>>>>>>>>>>>>>>>>>>>>>>>> not munmap'ed) >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Can you cooperate? >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>> >>> From david.holmes at oracle.com Thu Oct 19 12:27:41 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 19 Oct 2017 22:27:41 +1000 Subject: RFR: 8189373: jmap -heap exited with error code In-Reply-To: <17635b8d-7071-acbe-469e-688233862033@redhat.com> References: <5f387be5-bd8a-91c6-39d4-871a74e80355@redhat.com> <6b12f58a-35eb-d7a1-1684-64ed6b73708b@oracle.com> <17635b8d-7071-acbe-469e-688233862033@redhat.com> Message-ID: On 19/10/2017 9:28 PM, Roman Kennke wrote: > Hi David, hi Jini, > > thank you both for review! > > Now I need a sponsor. Final patch (incl. summary and reviewed-by): > http://cr.openjdk.java.net/~rkennke/8189373/webrev.01/ > I will sponsor this. Thanks, David > Thank you! > Roman > >> I also took a look and it seemed okay - Jini is the expert so that's >> good enough for me. :) >> >> Reviewed. 
>> >> David >> >> On 19/10/2017 8:57 PM, Jini George wrote: >>> Your changes look good to me, Roman. >>> >>> Nit: Please do change the copyright year to 2017 in >>> CollectedHeapName.java. >>> >>> Thank you, >>> Jini (not a Reviewer). >>> >>> On 10/18/2017 11:59 PM, Roman Kennke wrote: >>>> My recent CMSHeap extraction has broken the JVM serviceability agent. >>>> It looks like I actually need a little bit more boilerplate to make >>>> it happy: >>>> >>>> http://cr.openjdk.java.net/~rkennke/8189373/webrev.00/ >>>> >>>> >>>> It does fix the test that's mentioned in the bug report: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8189373 >>>> >>>> Is this the correct way to fix it? >>>> >>>> Roman >>>> >>>> > From rkennke at redhat.com Thu Oct 19 12:39:02 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 19 Oct 2017 14:39:02 +0200 Subject: RFR: 8189373: jmap -heap exited with error code In-Reply-To: References: <5f387be5-bd8a-91c6-39d4-871a74e80355@redhat.com> <6b12f58a-35eb-d7a1-1684-64ed6b73708b@oracle.com> <17635b8d-7071-acbe-469e-688233862033@redhat.com> Message-ID: On 19.10.2017 at 14:27, David Holmes wrote: > On 19/10/2017 9:28 PM, Roman Kennke wrote: >> Hi David, hi Jini, >> >> thank you both for review! >> >> Now I need a sponsor. Final patch (incl. summary and reviewed-by): >> http://cr.openjdk.java.net/~rkennke/8189373/webrev.01/ >> > > I will sponsor this. Thank you!! Roman From yasuenag at gmail.com Thu Oct 19 13:43:15 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 19 Oct 2017 22:43:15 +0900 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To: References: <4370fcb0-d06d-865b-8bab-e03ec1812e89@oracle.com> <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com> <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com> <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com> <8e2c7e3f-2f3a-7168-712b-2642a4e88239@oracle.com> <05720ca7-73ef-2761-e8f5-1d0fc45c8bdc@oracle.com> <6bffc41e-80e6-6863-f46a-fba7ba66dc2a@gmail.com> Message-ID: <4b785c83-d5cc-9aab-5186-39765d964e4a@gmail.com> Sorry, I forgot to apply the fix that uses OrderAccess::load_acquire() in PerfMemory::is_initialized(). I have fixed it in a new webrev. Could you review it again? http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.09/ Yasumasa On 2017/10/19 21:24, David Holmes wrote: > On 19/10/2017 9:44 PM, Yasumasa Suenaga wrote: >> Hi, >> >>> I suggest we leave the volatile off for now and file a RFE to add volatile_static_field support to VMStructs and update later. >> >> Okay. David or Serguei, could you file it? >> >> >>>> I'd suggest to fall back to your previous approach as synchronization was not there >>>> in the first place, and it is not a part of the original issue you are trying to fix >>>> (if David or anyone else does not have a simple solution). >> >>> I don't think trying to introduce locking would be a good idea as it would likely lead to deadlocks when a crash occurs. This could also be investigated as a future RFE if desired. >> >> Sorry, I misspelled "usable". >> I've fixed it in a new webrev: >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.08/ >> >> Can I list David and Serguei as Reviewers? >> If so, I will send a changeset to Serguei. > > Yes.
> > Thanks, > David > >> >> Thanks, >> >> Yasumasa >> >> >> On 2017/10/19 19:41, David Holmes wrote: >>> Hi Serguei, Yasumasa, >>> >>> I suggest we leave the volatile off for now and file a RFE to add volatile_static_field support to VMStructs and update later. >>> >>> I don't think trying to introduce locking would be a good idea as it would likely lead to deadlocks when a crash occurs. This could also be investigated as a future RFE if desired. >>> >>> Thanks, >>> David >>> >>> On 19/10/2017 7:37 PM, serguei.spitsyn at oracle.com wrote: >>>> Hi Yasumasa, >>>> >>>> I see the problem. >>>> As it turned out, making these variables volatile is non-trivial. >>>> But thank you a lot for trying! >>>> >>>> I'd suggest to fall back to your previous approach as synchronization was not there >>>> in the first place, and it is not a part of the original issue you are trying to fix >>>> (if David or anyone else does not have a simple solution). >>>> But let's check that David does not object to it. >>>> >>>> I will sponsor your fix after you send me a patch. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 10/18/17 20:21, Yasumasa Suenaga wrote: >>>>> Sorry, I made a mistake. >>>>> But I cannot compile yet: >>>>> >>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>>>> --- a/src/hotspot/share/runtime/vmStructs.cpp   Thu Sep 07 15:40:20 2017 +0200 >>>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp   Thu Oct 19 12:21:11 2017 +0900 >>>>> @@ -578,7 +578,7 @@ >>>>>      static_field(PerfMemory, _top, char*)    \ >>>>>      static_field(PerfMemory, _capacity, size_t)    \ >>>>>      static_field(PerfMemory, _prologue, PerfDataPrologue*)    \ >>>>> -    static_field(PerfMemory, _initialized, jint)    \ >>>>> +    static_field(PerfMemory, _initialized, volatile jint)
\ >>>>> >>>>> -------------- >>>>> In file included from /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:58: error: invalid conversion from 'volatile void*' to 'void*' [-fpermissive] >>>>> ? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>> ????? static_field(PerfMemory, _initialized,????????????????????????????????? volatile jint)??????????????????????????? \ >>>>> ????? ^~~~~~~~~~~~ >>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>>> ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>> ?? ^ >>>>> gmake[3]: *** [lib/CompileJvm.gmk:210: /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] Error 1 >>>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>>> >>>>> ERROR: Build failed for target 'images' in configuration 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>>> -------------- >>>>> >>>>> >>>>> >>>>> On 2017/10/19 12:18, Yasumasa Suenaga wrote: >>>>>> Hi Serguei, >>>>>> >>>>>>> Would the below work? : >>>>>>> >>>>>>> ? 578????? static_field(PerfMemory, _initialized, volatile jint)????????????????????????????????? \ >>>>>>> >>>>>>> It'd be similar to this non-static case: >>>>>>> ? 362?? nonstatic_field(ConstantPoolCacheEntry, _f1,????????????????????????????????? volatile Metadata*)??????????????????? 
\ >>>>>> >>>>>> I got error messages as below: >>>>>> >>>>>> --------------- >>>>>> In file included from /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected unqualified-id before 'volatile' >>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? \ >>>>>> ??????????????????????????????????????? ^ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>>>> ^~~~~~~~~ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>> ??? ^ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected '}' before 'volatile' >>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? \ >>>>>> ??????????????????????????????????????? ^ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>>>> ^~~~~~~~~ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>> ??? ^ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected '}' before 'volatile' >>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? \ >>>>>> ??????????????????????????????????????? 
^ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>>>> ^~~~~~~~~ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>> ??? ^ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:79: error: expected declaration before '}' token >>>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>>>> ^ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? \ >>>>>> ?????? ^~~~~~~~~~~~ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>> ??? ^ >>>>>> gmake[3]: *** [lib/CompileJvm.gmk:210: /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] Error 1 >>>>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>>>> >>>>>> ERROR: Build failed for target 'images' in configuration 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>>>> --------------- >>>>>> >>>>>> >>>>>> I changed as below: >>>>>> --------------- >>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/perfMemory.cpp >>>>>> --- a/src/hotspot/share/runtime/perfMemory.cpp? Thu Sep 07 15:40:20 2017 +0200 >>>>>> +++ b/src/hotspot/share/runtime/perfMemory.cpp? Thu Oct 19 12:15:30 2017 +0900 >>>>>> @@ -51,8 +51,9 @@ >>>>>> ??char*??????????????????? PerfMemory::_end = NULL; >>>>>> ??char*??????????????????? 
PerfMemory::_top = NULL; >>>>>> ??size_t?????????????????? PerfMemory::_capacity = 0; >>>>>> -jint???????????????????? PerfMemory::_initialized = false; >>>>>> +volatile jint??????????? PerfMemory::_initialized = 0; >>>>>> ??PerfDataPrologue*??????? PerfMemory::_prologue = NULL; >>>>>> +volatile bool??????????? PerfMemory::_destroyed = false; >>>>>> >>>>>> --- a/src/hotspot/share/runtime/perfMemory.hpp? Thu Sep 07 15:40:20 2017 +0200 >>>>>> +++ b/src/hotspot/share/runtime/perfMemory.hpp? Thu Oct 19 12:15:30 2017 +0900 >>>>>> @@ -113,13 +113,15 @@ >>>>>> ?? */ >>>>>> ??class PerfMemory : AllStatic { >>>>>> ????? friend class VMStructs; >>>>>> +??? friend class PerfMemoryTest; >>>>>> ??? private: >>>>>> ????? static char*? _start; >>>>>> ????? static char*? _end; >>>>>> ????? static char*? _top; >>>>>> ????? static size_t _capacity; >>>>>> ????? static PerfDataPrologue*? _prologue; >>>>>> -??? static jint?? _initialized; >>>>>> +??? static volatile jint????? _initialized; >>>>>> +??? static volatile bool????? _destroyed; >>>>>> >>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>>>>> --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 15:40:20 2017 +0200 >>>>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 12:15:30 2017 +0900 >>>>>> @@ -578,7 +578,7 @@ >>>>>> ?????? static_field(PerfMemory, _top, char*)???????????????????????????????? \ >>>>>> ?????? static_field(PerfMemory, _capacity, size_t)??????????????????????????????? \ >>>>>> ?????? static_field(PerfMemory, _prologue, PerfDataPrologue*)???????????????????? \ >>>>>> -???? static_field(PerfMemory, _initialized, jint)????????????????????????????????? \ >>>>>> +???? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? 
\ >>>>>> --------------- >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2017/10/19 6:18, serguei.spitsyn at oracle.com wrote: >>>>>>> On 10/18/17 06:51, Yasumasa Suenaga wrote: >>>>>>>> Hi David, Serguei, >>>>>>>> >>>>>>>>> because as soon as we have checked is_usable() an abort happening in another thread may have changed that by calling destroy. >>>>>>>>> >>>>>>>>> This code is basically broken if we hit an abort path instead of a normal VM shutdown. >>>>>>>> >>>>>>>> Can we use MutexLocker for initialize() and destroy()? >>>>>>>> >>>>>>>> >>>>>>>> I've tried to address your comments, but I have an issue with volatile. >>>>>>>> PerfMemory.java depends on PerfMemory::_initialized. However, VMStructs cannot handle static volatile variables. >>>>>>>> I can think of two approaches: >>>>>>>> >>>>>>>> >>>>>>>>   1. Remove the _initialized check from PerfMemory.java >>>>>>>>      SA will throw UnmappedAddressException if JSnap tries to access an invalid address, including uninitialized memory. >>>>>>>> >>>>>>>>   2. Add static volatile support to VMStructs >>>>>>>> >>>>>>>> >>>>>>>> Which should we do? >>>>>>>> 1. is easy to fix, but 2. might be the right way... >>>>>>> >>>>>>> Would the below work? : >>>>>>> >>>>>>>   578      static_field(PerfMemory, _initialized, volatile jint)    \ >>>>>>> >>>>>>> It'd be similar to this non-static case: >>>>>>>   362   nonstatic_field(ConstantPoolCacheEntry, _f1, volatile Metadata*)    \ >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2017/10/18 21:34, David Holmes wrote: >>>>>>>>> Just to clarify ...
>>>>>>>>> >>>>>>>>> On 18/10/2017 10:28 PM, David Holmes wrote: >>>>>>>>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> Thank you for jumping to this review and helping Yasumasa to sort it out! >>>>>>>>>>> I've just discovered that this issue was already on the table for several months without a significant progress. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 10/18/17 02:48, David Holmes wrote: >>>>>>>>>>>> Hi Serguei >>>>>>>>>>>> >>>>>>>>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> Sorry for a quite late participation. >>>>>>>>>>>>> >>>>>>>>>>>>> I looked at the previous webrevs and think that this one is much better. >>>>>>>>>>>>> >>>>>>>>>>>>> Some concern is if we need any kind of synchronization here, e.g. CAS. >>>>>>>>>>>>> But it depends on the PerfMemory class usage. >>>>>>>>>>>>> >>>>>>>>>>>>> Should we make the static variables '_initialized' and '_destroyed' volatile? >>>>>>>>>>>> >>>>>>>>>>>> For good measure - yes. >>>>>>>>>>>> >>>>>>>>>>>>> Also, the '_initialized' is set to 1 with: >>>>>>>>>>>>> ??? 159 OrderAccess::release_store(&_initialized, 1); >>>>>>>>>>>>> >>>>>>>>>>>>> Should we do the same to set the '_destroyed'?: >>>>>>>>>>>>> 200 _destroyed = true; >>>>>>>>>>>> >>>>>>>>>>>> There is a benign initialization race but we need the release_store to ensure all the data fields can be read if _initialized is seen as true. But what is missing is a load_acquire() in is_initialized() to ensure we synchronize with that store! >>>>>>>>>>> >>>>>>>>>>> Yes, I noticed that the load_acquire() is missed. :| >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> There is also a potential for a destruction race (if multiple aborts happens concurrently in different threads) but that also seems benign. In this case there is no data being set so the store to _destroyed does not need to be a release_store. 
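The release_store/load_acquire pairing described above can be sketched in standard C++. This is only a minimal illustration, assuming std::atomic in place of HotSpot's OrderAccess; the PerfRegionSketch type and all of its members are hypothetical names and are not the PerfMemory code under review:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the PerfMemory state discussed in this thread.
struct PerfRegionSketch {
  char*       start = nullptr;
  std::size_t capacity = 0;
  std::atomic<int>  initialized{0};
  std::atomic<bool> destroyed{false};

  void initialize(char* buffer, std::size_t size) {
    if (is_initialized()) return;   // guard on the flag, not on a data field
    start = buffer;
    capacity = size;
    // Release: the writes to start/capacity are published before the flag.
    initialized.store(1, std::memory_order_release);
  }

  bool is_initialized() const {
    // Acquire: pairs with the release store in initialize().
    return initialized.load(std::memory_order_acquire) != 0;
  }

  void destroy() {
    // A concurrent second caller may also pass this check; the thread
    // treats that race as benign.
    if (!is_initialized() || is_destroyed()) return;
    // No data is published here, so a relaxed store is enough. The data
    // fields are left intact so a core-file reader can still see them.
    destroyed.store(true, std::memory_order_relaxed);
  }

  bool is_destroyed() const { return destroyed.load(std::memory_order_relaxed); }
  bool is_usable() const { return is_initialized() && !is_destroyed(); }

  std::size_t capacity_checked() const {
    assert(is_usable() && "called before init or after destroy");
    return capacity;
  }
};
```

The acquire load is what lets a reader that sees the flag set also rely on start and capacity; without it the release store alone gives no guarantee on the reading side.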
>>>>>>>>>>> >>>>>>>>>>> I'm not convinced yet this is benign as the PerfMemory::destroy() has this call: >>>>>>>>>>> 197 delete_memory_region(); >>>>>>>>>> >>>>>>>>>> Yes though most of its work ends up being no-ops. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Now, I started thinking about the asserts that call the is_useable(). >>>>>>>>>>> Should they be returns instead? >>>>>>>>>> >>>>>>>>>> I think this is a somewhat confused chunk of code. It's only fractionally thread-safe yet once in use could be in use concurrently with an aborting thread that calls destroy(). I don't think there is any simple fix for this. If we're in the process of crashing does it really matter if we trigger a secondary crash due to this? >>>>>>>>> >>>>>>>>> It doesn't matter if we do: >>>>>>>>> >>>>>>>>> assert(is_usable(),...); >>>>>>>>> // continue >>>>>>>>> >>>>>>>>> or >>>>>>>>> >>>>>>>>> if (!is_usable()) return; >>>>>>>>> // continue >>>>>>>>> >>>>>>>>> because as soon as we have checked is_usable() and abort happening in another thread may have changed that by calling destroy. >>>>>>>>> >>>>>>>>> This code is basically broken if we hit an abort path instead of a normal VM shutdown. >>>>>>>>> >>>>>>>>> David >>>>>>>>> ----- >>>>>>>>> >>>>>>>>>> The problems with this code go way beyond what Yasumasa is trying to address with the JSnap problem and I would not want to put it back on him to try and come up with an overall solution. >>>>>>>>>> >>>>>>>>>>> Then the is_destroyed() would better to have the load_acquire(). >>>>>>>>>> >>>>>>>>>> You could add a load_acquire and do the store_release. It certainly would not hurt, but I don't think it would actually benefit anything either. >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> Just interested to know what do you think on this.
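[Editorial sketch] The point made above, that neither an assert nor an early return closes the window between checking is_usable() and using the memory, can be shown deterministically with a small model. This is a sketch only: Region and checked_use() are hypothetical names, and the "between" callback stands in for another thread reaching destroy() on an abort path.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical model of the initialized/destroyed pair; not the HotSpot class.
struct Region {
  std::atomic<bool> initialized{false};
  std::atomic<bool> destroyed{false};

  bool is_usable() const {
    return initialized.load(std::memory_order_acquire) &&
           !destroyed.load(std::memory_order_acquire);
  }
  void destroy() { destroyed.store(true, std::memory_order_release); }
};

// Runs the guard, then whatever happens "in between" (modelling another
// thread), and reports whether the region is still usable at the point of use.
template <typename Between>
bool checked_use(Region& r, Between between) {
  if (!r.is_usable()) {
    return false;          // guard: same effect whether written as assert or return
  }
  between();               // window in which an aborting thread may call destroy()
  return r.is_usable();    // state actually seen when the memory would be touched
}
```

Passing a callback that calls destroy() makes checked_use() report false even though the guard passed, which is the situation described above. Closing that window would need some form of mutual exclusion, and the thread notes that locking on a crash path risks deadlock.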
>>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Cheers, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Serguei >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thank you for your comment. >>>>>>>>>>>>>> I uploaded new webrev: >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Serguei, please comment about this :-) >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2017-10-18 16:09 GMT+09:00 David Holmes: >>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>>>>>>>>>> can't >>>>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I've added PerfMemory::is_useable() to check whether we can access to >>>>>>>>>>>>>>>> PerfMemory. >>>>>>>>>>>>>>>> I think this webrev prevent to access to PerfMemory after destroy() call. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 90 void PerfMemory::initialize() { >>>>>>>>>>>>>>> 91 >>>>>>>>>>>>>>> 92 if (_prologue != NULL) >>>>>>>>>>>>>>> 93 // initialization already performed >>>>>>>>>>>>>>> 94 return; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 213 assert(is_useable(), "called before initialization"); >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -> "called before init or after destroy" >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Could add a similar assert in PerfMemory::mark_updated().
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Let's see what Serguei thinks. :) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>>>>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call munmap() >>>>>>>>>>>>>>>>>>>> for PerfMemory. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Perhaps not but there are still other actions that happen and the point >>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>> we should not be able to continue to use PerfMemory once it has been >>>>>>>>>>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>>>>>>>>>>>>>> decide how should we do. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> In that discussion, I uploaded another webrev which adds other fields >>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>> Is it suitable? 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>>>>>>>>>> can't >>>>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>>>>>>>>>> initialization? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>>>>>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I'm not familiar with these tools. When do we produce a core file after >>>>>>>>>>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Ah - right. I assume we need to close off the perfdata file before we >>>>>>>>>>>>>>>>> abort. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>>>>>> #0 perfMemory_exit () >>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>>>>>>>>>> #1 0x00007f99b091c949 in os::shutdown () >>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>>>>>>>>>> #2 0x00007f99b091c980 in os::abort (dump_core=) >>>>>>>>>>>>>>>>>>
at >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>>>>>>>>>> #3 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>>>>>>>>>> this=this@entry=0x7ffcacf40b50) >>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>>>>>>>>>> #4 0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig@entry=11, >>>>>>>>>>>>>>>>>> info=info@entry=0x7ffcacf40df0, >>>>>>>>>>>>>>>>>> ucVoid=ucVoid@entry=0x7ffcacf40cc0, >>>>>>>>>>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized@entry=1) >>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Assertions and direct guards. Checking _prologue is a placeholder for >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>> real check.
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> # Internal Error >>>>>>>>>>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>>>>>>>>>> # assert(_prologue != __null) failed: called before initialization >>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>>>>>>>>>>> initialization, >>>>>>>>>>>>>>>>>>>>> or >>>>>>>>>>>>>>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>>>>>>>>>> initialization? But it seems to me that there are various checks of >>>>>>>>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>>>>>>>>>> is_destroyed() as a guard.
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse core image >>>>>>>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>>>>>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not been reviewed >>>>>>>>>>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we could not >>>>>>>>>>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes and core >>>>>>>>>>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think this patch is >>>>>>>>>>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double free, and >>>>>>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not munmap'ed) >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Can you cooperate? 
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>> From erik.gahlin at oracle.com Thu Oct 19 15:14:52 2017 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Thu, 19 Oct 2017 17:14:52 +0200 Subject: RFR(S): 8189425: Minor updates in support of closed changes In-Reply-To: References: <59E6BB27.8020605@oracle.com> <9acd0472-de6b-4f6f-454e-7a325492af15@oracle.com> <59E7B3E6.8000101@oracle.com> Message-ID: <59E8C16C.10604@oracle.com> Thanks for the review, David and Markus! Erik > Hi Erik, > > Looks good. > > Thanks > Markus > > -----Original Message----- > From: Erik Gahlin > Sent: den 18 oktober 2017 22:05 > To: David Holmes; serviceability-dev at openjdk.java.net > Subject: Re: RFR(S): 8189425: Minor updates in support of closed changes > > Hi David, > >> Hi Erik, >> >> On 18/10/2017 12:23 PM, Erik Gahlin wrote: >>> Hi, >>> >>> Could I have a review of this change that will adjust an assertion >>> and >> Can you explain the adjustment please. > We have closed code that modifies the mark word and then changes it back during a safepoint. When the mark word is modified, we reuse GC infrastructure that run into the assert. If we change the assert to ignore checking that the mark word is NULL, we don't run into the problem. > >>> remove a lock associated with JFR. > I forgot to modify the header file, see updated webrev. > > http://cr.openjdk.java.net/~egahlin/8189425_1/ > > I also made a change to GrowableArray, the insert_sorted method now takes a const. 
> > Thanks > Erik > >> That bit is fine :) >> >> Thanks, >> David >> >>> Webrev: >>> http://cr.openjdk.java.net/~egahlin/8189425_0 >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8189425 >>> >>> Thanks >>> Erik >>> >>> From ben_walsh at uk.ibm.com Thu Oct 19 10:07:37 2017 From: ben_walsh at uk.ibm.com (Ben Walsh) Date: Thu, 19 Oct 2017 11:07:37 +0100 Subject: [PATCH] Unnecessary Amount Of Internal Class Conversion Message-ID: Per Alan's request here ( http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-October/049532.html ), I am redirecting my initial email to this mailing list ... I have observed a problem where an unnecessary amount of internal class conversion is occurring. I have a patch which I would like to contribute which represents a performance optimisation for this problem in the java.instrument module implementation. It avoids having to convert all classes from a JVM's internal class format to the .class file format, in order to call the ClassFileLoadHook, when no java transformer installed. I would like to pair with a sponsor who could host and review this patch, so I can get it contributed. Regards, Ben Walsh Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From rkennke at redhat.com Thu Oct 19 16:39:24 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 19 Oct 2017 18:39:24 +0200 Subject: RFR: 8183542: Factor out serial GC specific code from GenCollectedHeap into its own subclass In-Reply-To: <11759C2F-6F17-49C0-9D68-5B5B057C14B7@oracle.com> References: <6dea5306-1ec9-d439-b498-b9e657d7daf0@redhat.com> <3979E865-1363-4B60-ABDF-63128BAF04CE@oracle.com> <352c558d-40ba-27c4-861b-fcdb99ef706d@redhat.com> <487881EC-2CCD-47C4-BA7A-1723A40C1F9F@oracle.com> <11759C2F-6F17-49C0-9D68-5B5B057C14B7@oracle.com> Message-ID: <4f84ff16-3f69-d2cf-2eb3-2e820aad49af@redhat.com> Am 18.10.2017 um 22:41 schrieb Kim Barrett: >> On Oct 18, 2017, at 4:04 PM, Roman Kennke wrote: >> >> Am 18.10.2017 um 20:41 schrieb Kim Barrett: >>>> On Oct 18, 2017, at 8:08 AM, Roman Kennke wrote: >>>> Differential webrev: >>>> http://cr.openjdk.java.net/~rkennke/8183542/webrev.01.diff/ >>>> >>>> Full webrev: >>>> http://cr.openjdk.java.net/~rkennke/8183542/webrev.01/ >>>> >>>> Better now? >>>> >>>> Thanks, Roman >>> Looks good. >>> >> Hi Kim, >> >> thanks for the review. >> >> I just fixed a bug caused by my similar CMSHeap extraction, and I think I need to do the same thing for SerialHeap too: >> >> https://bugs.openjdk.java.net/browse/JDK-8189373 >> >> This is the fix for the CMSHeap issue: >> >> http://cr.openjdk.java.net/~rkennke/8189373/webrev.00/ >> >> I'll do the same for SerialHeap once the above has been approved and pushed, otherwise it'll be a mess. ;-) >> >> Roman > The SA strikes again! Yes, it looks like the same thing should be done for SerialHeap. > I'm going to leave the review of 8189373 to others who have more clue about the SA.
> Okidoki, so here comes the SerialGC with SA boilerplate: Differential: http://cr.openjdk.java.net/~rkennke/8183542/webrev.02.diff/ Full: http://cr.openjdk.java.net/~rkennke/8183542/webrev.02/ This builds on top of the patch for https://bugs.openjdk.java.net/browse/JDK-8189373 which should land in the repo shortly, and implements the same thing for SerialHeap. It also passes the test that failed in the mentioned bug report (with -XX:+UseSerialGC). Can I get reviews (for the changed/added stuff) again? Thanks, Roman From serguei.spitsyn at oracle.com Thu Oct 19 17:49:29 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 19 Oct 2017 10:49:29 -0700 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: <4b785c83-d5cc-9aab-5186-39765d964e4a@gmail.com> References: <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com> <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com> <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com> <8e2c7e3f-2f3a-7168-712b-2642a4e88239@oracle.com> <05720ca7-73ef-2761-e8f5-1d0fc45c8bdc@oracle.com> <6bffc41e-80e6-6863-f46a-fba7ba66dc2a@gmail.com> <4b785c83-d5cc-9aab-5186-39765d964e4a@gmail.com> Message-ID: <9869ac5d-7ae1-7378-7790-08e3571cdc4d@oracle.com> An HTML attachment was scrubbed... URL: From yasuenag at gmail.com Fri Oct 20 03:13:59 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Fri, 20 Oct 2017 12:13:59 +0900 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. 
In-Reply-To: <9869ac5d-7ae1-7378-7790-08e3571cdc4d@oracle.com> References: <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com> <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com> <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com> <8e2c7e3f-2f3a-7168-712b-2642a4e88239@oracle.com> <05720ca7-73ef-2761-e8f5-1d0fc45c8bdc@oracle.com> <6bffc41e-80e6-6863-f46a-fba7ba66dc2a@gmail.com> <4b785c83-d5cc-9aab-5186-39765d964e4a@gmail.com> <9869ac5d-7ae1-7378-7790-08e3571cdc4d@oracle.com> Message-ID: <3ababa4f-f417-664c-c436-f9f1a5ff7c6c@gmail.com> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.09/ > > It looks good to me. > I've filed: > https://bugs.openjdk.java.net/browse/JDK-8189685 > need PerfMemory class update and a volatile_static_field support in VMStructs Thanks! Yasumasa On 2017/10/20 2:49, serguei.spitsyn at oracle.com wrote: > Hi Yasumasa, > > > On 10/19/17 06:43, Yasumasa Suenaga wrote: >> Sorry, I forgot the fix to use OrderAccess::load_acquire() in PerfMemory::is_initialized(). >> I fixed it in new webrev. Could you review again? >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.09/ > > It looks good to me. > Thank you for your patience. > > >> Yasumasa >> >> >> On 2017/10/19 21:24, David Holmes wrote: >>> On 19/10/2017 9:44 PM, Yasumasa Suenaga wrote: >>>> Hi, >>>> >>>>> I suggest we leave the volatile off for now and file a RFE to add volatile_static_field support to VMStructs and update later. >>>> >>>> Okay. David or Serguei, could you file it? > > I've filed: > https://bugs.openjdk.java.net/browse/JDK-8189685 > need PerfMemory class update and a volatile_static_field support in VMStructs > > Feel free to update it if necessary.
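[Editorial sketch] The "invalid conversion from 'volatile void*' to 'void*'" build error quoted further down in this thread comes from a general C++ rule: the address of a volatile object does not convert implicitly to a plain void*. The following is a minimal illustration of that rule and of one possible workaround. It is an assumption-laden sketch, not the actual VMStructs change tracked by JDK-8189685; Holder, Entry, strip_volatile() and make_entry() are hypothetical names.

```cpp
#include <cassert>

// Hypothetical class with one volatile and one plain static field.
struct Holder {
  static volatile int initialized;
  static int capacity;
};
volatile int Holder::initialized = 1;
int Holder::capacity = 0;

// Hypothetical table entry that stores field addresses as plain void*.
struct Entry {
  const char* name;
  void* address;
};

// void* p = &Holder::initialized;   // does not compile: volatile is not dropped implicitly
// void* q = &Holder::capacity;      // fine: no qualifier to drop

// Explicitly remove the volatile qualifier from the pointee type.
inline void* strip_volatile(volatile void* p) {
  return const_cast<void*>(p);
}

// Accepts both plain and volatile fields, since T* converts to volatile void*.
inline Entry make_entry(const char* name, volatile void* addr) {
  return Entry{name, strip_volatile(addr)};
}
```

Stripping the qualifier is only reasonable when the stored address is used as an opaque location, as in a table read by an external tool; code that dereferences it inside the process should cast back to the volatile type first.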
> > Thanks, > Serguei > > >>>>>> I'd suggest to fall back to your previous approach as synchronization was not there >>>>>> in the first place, and it is not a part of the original issue you are trying to fix >>>>>> (if David or anyone else does not a simple solution). >>>> >>>>> I don't think trying to introduce locking would be a good idea as it would likely lead to deadlocks when a crash occurs. This could also be investigated as a future RFE if desired. >>>> >>>> Sorry, I have mistake the spell of "usable". >>>> I've fixed it in new webrev: >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.08/ >>>> >>>> Can I list David and Serguei as Reviewer? >>>> I will send a changeset to Serguei if it can. >>> >>> Yes. >>> >>> Thanks, >>> David >>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2017/10/19 19:41, David Holmes wrote: >>>>> Hi Serguei, Yasumasa, >>>>> >>>>> I suggest we leave the volatile off for now and file a RFE to add volatile_static_field support to VMStructs and update later. >>>>> >>>>> I don't think trying to introduce locking would be a good idea as it would likely lead to deadlocks when a crash occurs. This could also be investigated as a future RFE if desired. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 19/10/2017 7:37 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> I see the problem. >>>>>> As it occurred making these variables volatile is non-trivial. >>>>>> But thank you a lot for trying! >>>>>> >>>>>> I'd suggest to fall back to your previous approach as synchronization was not there >>>>>> in the first place, and it is not a part of the original issue you are trying to fix >>>>>> (if David or anyone else does not a simple solution). >>>>>> But let's check if David does not object against it. >>>>>> >>>>>> I will sponsor your fix after you send me a patch. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 10/18/17 20:21, Yasumasa Suenaga wrote: >>>>>>> Sorry, I have mistake. 
>>>>>>> But I cannot compile yet: >>>>>>> >>>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>>>>>> --- a/src/hotspot/share/runtime/vmStructs.cpp Thu Sep 07 15:40:20 2017 +0200 >>>>>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp Thu Oct 19 12:21:11 2017 +0900 >>>>>>> @@ -578,7 +578,7 @@ >>>>>>> static_field(PerfMemory, _top, char*) \ >>>>>>> static_field(PerfMemory, _capacity, size_t) \ >>>>>>> static_field(PerfMemory, _prologue, PerfDataPrologue*) \ >>>>>>> - static_field(PerfMemory, _initialized, jint) \ >>>>>>> + static_field(PerfMemory, _initialized, volatile jint) \ >>>>>>> >>>>>>> -------------- >>>>>>> In file included from /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:58: error: invalid conversion from 'volatile void*' to 'void*' [-fpermissive] >>>>>>> { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>> static_field(PerfMemory, _initialized, volatile jint) \ >>>>>>> ^~~~~~~~~~~~ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>>>>> VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>>
^ >>>>>>> gmake[3]: *** [lib/CompileJvm.gmk:210: /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] Error 1 >>>>>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>>>>> >>>>>>> ERROR: Build failed for target 'images' in configuration 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>>>>> -------------- >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2017/10/19 12:18, Yasumasa Suenaga wrote: >>>>>>>> Hi Serguei, >>>>>>>> >>>>>>>>> Would the below work? : >>>>>>>>> >>>>>>>>> 578 static_field(PerfMemory, _initialized, volatile jint) \ >>>>>>>>> >>>>>>>>> It'd be similar to this non-static case: >>>>>>>>> 362 nonstatic_field(ConstantPoolCacheEntry, _f1, volatile Metadata*) \ >>>>>>>> >>>>>>>> I got error messages as below: >>>>>>>> >>>>>>>> --------------- >>>>>>>> In file included from /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected unqualified-id before 'volatile' >>>>>>>> static_field(PerfMemory, volatile _initialized, jint) \ >>>>>>>> ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>>> { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>>>>>> ^~~~~~~~~ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>>>>>> VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>>> ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected '}' before 'volatile' >>>>>>>> static_field(PerfMemory,
volatile _initialized, jint) \ >>>>>>>> ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>>> { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>>>>>> ^~~~~~~~~ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>>>>>> VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>>> ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: error: expected '}' before 'volatile' >>>>>>>> static_field(PerfMemory, volatile _initialized, jint) \ >>>>>>>> ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>>> { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>>>>>> ^~~~~~~~~ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>>>>>> VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>>> ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:79: error: expected declaration before '}' token >>>>>>>> { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, &typeName::fieldName }, >>>>>>>> ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>>> static_field(PerfMemory, volatile _initialized, jint) \ >>>>>>>>
^~~~~~~~~~~~ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: note: in expansion of macro 'VM_STRUCTS' >>>>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>>> ??? ^ >>>>>>>> gmake[3]: *** [lib/CompileJvm.gmk:210: /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] Error 1 >>>>>>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>>>>>> >>>>>>>> ERROR: Build failed for target 'images' in configuration 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>>>>>> --------------- >>>>>>>> >>>>>>>> >>>>>>>> I changed as below: >>>>>>>> --------------- >>>>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/perfMemory.cpp >>>>>>>> --- a/src/hotspot/share/runtime/perfMemory.cpp? Thu Sep 07 15:40:20 2017 +0200 >>>>>>>> +++ b/src/hotspot/share/runtime/perfMemory.cpp? Thu Oct 19 12:15:30 2017 +0900 >>>>>>>> @@ -51,8 +51,9 @@ >>>>>>>> ??char*??????????????????? PerfMemory::_end = NULL; >>>>>>>> ??char*??????????????????? PerfMemory::_top = NULL; >>>>>>>> ??size_t?????????????????? PerfMemory::_capacity = 0; >>>>>>>> -jint???????????????????? PerfMemory::_initialized = false; >>>>>>>> +volatile jint??????????? PerfMemory::_initialized = 0; >>>>>>>> ??PerfDataPrologue*??????? PerfMemory::_prologue = NULL; >>>>>>>> +volatile bool??????????? PerfMemory::_destroyed = false; >>>>>>>> >>>>>>>> --- a/src/hotspot/share/runtime/perfMemory.hpp? Thu Sep 07 15:40:20 2017 +0200 >>>>>>>> +++ b/src/hotspot/share/runtime/perfMemory.hpp? Thu Oct 19 12:15:30 2017 +0900 >>>>>>>> @@ -113,13 +113,15 @@ >>>>>>>> ?? */ >>>>>>>> ??class PerfMemory : AllStatic { >>>>>>>> ????? friend class VMStructs; >>>>>>>> +??? friend class PerfMemoryTest; >>>>>>>> ??? private: >>>>>>>> ????? static char*? _start; >>>>>>>> ????? static char*? _end; >>>>>>>> ????? static char*? _top; >>>>>>>> ????? static size_t _capacity; >>>>>>>> ????? static PerfDataPrologue*? 
_prologue; >>>>>>>> -??? static jint?? _initialized; >>>>>>>> +??? static volatile jint????? _initialized; >>>>>>>> +??? static volatile bool????? _destroyed; >>>>>>>> >>>>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>>>>>>> --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 15:40:20 2017 +0200 >>>>>>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 12:15:30 2017 +0900 >>>>>>>> @@ -578,7 +578,7 @@ >>>>>>>> ?????? static_field(PerfMemory, _top, char*)???????????????????????????????? \ >>>>>>>> ?????? static_field(PerfMemory, _capacity, size_t)??????????????????????????????? \ >>>>>>>> ?????? static_field(PerfMemory, _prologue, PerfDataPrologue*)???????????????????? \ >>>>>>>> -???? static_field(PerfMemory, _initialized, jint)????????????????????????????????? \ >>>>>>>> +???? static_field(PerfMemory,???????? volatile _initialized, jint)????????????????????????????????? \ >>>>>>>> --------------- >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2017/10/19 6:18, serguei.spitsyn at oracle.com wrote: >>>>>>>>> On 10/18/17 06:51, Yasumasa Suenaga wrote: >>>>>>>>>> Hi David, Serguei, >>>>>>>>>> >>>>>>>>>>> because as soon as we have checked is_usable() and abort happening in another thread may have changed that by calling destroy. >>>>>>>>>>> >>>>>>>>>>> This code is basically broken if we hit an abort path instead of a normal VM shutdown. >>>>>>>>>> >>>>>>>>>> Can we use MutexLocker for initialize() and destroy() ? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I've tried to fix about your comments, but I have an issue about volatile. >>>>>>>>>> PerfMemory.java depends on PerfMemory::_initialized. However VMStructs cannot handle static volatile variables. >>>>>>>>>> I think two approaches as below: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> ? 1. Remove _initialized check from PerfMemory.java >>>>>>>>>> ???? 
SA will throw UnmappedAddressException if JSnap try to access invalid address including uninitialized memory. >>>>>>>>>> >>>>>>>>>> ? 2. Add static volatile support to VMStructs >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Which should we do? >>>>>>>>>> 1. is easy to fix. But 2. might be right way... >>>>>>>>> >>>>>>>>> Would the below work? : >>>>>>>>> >>>>>>>>> ??578????? static_field(PerfMemory, _initialized, volatile jint)????????????????????????????????? \ >>>>>>>>> >>>>>>>>> It'd be similar to this non-static case: >>>>>>>>> ??362?? nonstatic_field(ConstantPoolCacheEntry, _f1,????????????????????????????????? volatile Metadata*)??????????????????? \ >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2017/10/18 21:34, David Holmes wrote: >>>>>>>>>>> Just to clarify ... >>>>>>>>>>> >>>>>>>>>>> On 18/10/2017 10:28 PM, David Holmes wrote: >>>>>>>>>>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>> Hi David, >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you for jumping to this review and helping Yasumasa to sort it out! >>>>>>>>>>>>> I've just discovered that this issue was already on the table for several months without a significant progress. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 10/18/17 02:48, David Holmes wrote: >>>>>>>>>>>>>> Hi Serguei >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sorry for a quite late participation. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I looked at the previous webrevs and think that this one is much better. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Some concern is if we need any kind of synchronization here, e.g. CAS. >>>>>>>>>>>>>>> But it depends on the PerfMemory class usage. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Should we make the static variables '_initialized' and '_destroyed' volatile? 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> For good measure - yes. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Also, the '_initialized' is set to 1 with: >>>>>>>>>>>>>>> ??? 159 OrderAccess::release_store(&_initialized, 1); >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Should we do the same to set the '_destroyed'?: >>>>>>>>>>>>>>> 200 _destroyed = true; >>>>>>>>>>>>>> >>>>>>>>>>>>>> There is a benign initialization race but we need the release_store to ensure all the data fields can be read if _initialized is seen as true. But what is missing is a load_acquire() in is_initialized() to ensure we synchronize with that store! >>>>>>>>>>>>> >>>>>>>>>>>>> Yes, I noticed that the load_acquire() is missed. :| >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> There is also a potential for a destruction race (if multiple aborts happens concurrently in different threads) but that also seems benign. In this case there is no data being set so the store to _destroyed does not need to be a release_store. >>>>>>>>>>>>> >>>>>>>>>>>>> I'm not convinced yet this is benign as the PerfMemory::destroy() has this call: >>>>>>>>>>>>> ?? 197 delete_memory_region(); >>>>>>>>>>>> >>>>>>>>>>>> Yes though most of its work ends up being no-ops. >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Now, I started thinking about the asserts that call the is_useable(). >>>>>>>>>>>>> Should they be returns instead? >>>>>>>>>>>> >>>>>>>>>>>> I think this is a somewhat confused chunk of code. It's only fractionally thread-safe yet once in use could be in use concurrently with an aborting thread that calls destroy(). I don't think there is any simple fix for this. If we're in the process of crashing does it really matter if we trigger a secondary crash due to this? 
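[Editor's sketch] The release_store / load_acquire pairing discussed above can be illustrated with standard C++ atomics. This is only a minimal sketch under stated assumptions: `PerfRegion`, `g_buffer`, `initialize_region` and `region_is_initialized` are hypothetical names, and `std::atomic` stands in for HotSpot's `OrderAccess` calls, which are not reproduced here. The point is that the writer fills in the data fields first and publishes the flag last with a release store, while the reader checks the flag with an acquire load before touching the data.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the PerfMemory state; not the HotSpot class.
struct PerfRegion {
  char*            start = nullptr;
  std::size_t      capacity = 0;
  std::atomic<int> initialized{0};
};

static char g_buffer[64];  // illustrative backing storage

// Writer side: set every data field, then publish with a release store.
// A reader that observes initialized == 1 through an acquire load is
// guaranteed to also observe the field values written before the store.
void initialize_region(PerfRegion& r) {
  r.start = g_buffer;
  r.capacity = sizeof(g_buffer);
  r.initialized.store(1, std::memory_order_release);
}

// Reader side: the acquire load pairs with the release store above.
// A plain (relaxed) load here would not give that ordering guarantee.
bool region_is_initialized(const PerfRegion& r) {
  return r.initialized.load(std::memory_order_acquire) != 0;
}

// Reads the capacity only after the flag has been observed as set.
std::size_t capacity_if_initialized(const PerfRegion& r) {
  return region_is_initialized(r) ? r.capacity : 0;
}
```

Note that this ordering only covers initialization. It does nothing for the destroy race raised in the thread, where a check can be invalidated by another thread immediately after it succeeds.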
>>>>>>>>>>> >>>>>>>>>>> It doesn't matter if we do: >>>>>>>>>>> >>>>>>>>>>> assert(is_usable(),...); >>>>>>>>>>> // continue >>>>>>>>>>> >>>>>>>>>>> or >>>>>>>>>>> >>>>>>>>>>> if (!is_usable()) return; >>>>>>>>>>> // continue >>>>>>>>>>> >>>>>>>>>>> because as soon as we have checked is_usable() and abort happening in another thread may have changed that by calling destroy. >>>>>>>>>>> >>>>>>>>>>> This code is basically broken if we hit an abort path instead of a normal VM shutdown. >>>>>>>>>>> >>>>>>>>>>> David >>>>>>>>>>> ----- >>>>>>>>>>> >>>>>>>>>>>> The problems with this code go way beyond what Yasumasa is trying to address with the JSnap problem and I would not want to put it back on him to try and come up with an overall solution. >>>>>>>>>>>> >>>>>>>>>>>>> Then the is_destroyed() would better to have the load_acquire(). >>>>>>>>>>>> >>>>>>>>>>>> You could add a load_acquire and do the store_release. It certainly would not hurt, but I don't think it would actually benefit anything either. >>>>>>>>>>>> >>>>>>>>>>>> Cheers, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> Just interested to know what do you think on this. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Serguei >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thank you for your comment. 
>>>>>>>>>>>>>>>> I uploaded new webrev: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Serguei, please comment about this :-) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2017-10-18 16:09 GMT+09:00 David Holmes: >>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>>>>>>>>>>>> can't >>>>>>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I've added PerfMemory::is_useable() to check whether we can access to >>>>>>>>>>>>>>>>>> PerfMemory. >>>>>>>>>>>>>>>>>> I think this webrev prevent to access to PerfMemory after destroy() call. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ?? 90 void PerfMemory::initialize() { >>>>>>>>>>>>>>>>> ?? 91 >>>>>>>>>>>>>>>>> ?? 92?? if (_prologue != NULL) >>>>>>>>>>>>>>>>> ?? 93???? // initialization already performed >>>>>>>>>>>>>>>>> ?? 94???? return; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ? 213?? assert(is_useable(), "called before initialization"); >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -> "called before init or after destroy" >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Let's see what Serguei thinks. 
:) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2017-10-18 13:44 GMT+09:00 David Holmes: >>>>>>>>>>>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David Holmes: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does not call munmap() >>>>>>>>>>>>>>>>>>>>>> for PerfMemory. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Perhaps not but there are still other actions that happen and the point >>>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>> we should not be able to continue to use PerfMemory once it has been >>>>>>>>>>>>>>>>>>>>> destroyed (even if the destruction is only logical). >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I received same comment from Dmitry in the past, but we couldn't >>>>>>>>>>>>>>>>>>>> decide how should we do. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> In that discussion, I uploaded another webrev which adds other fields >>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>>> Is it suitable? 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure the existing ones >>>>>>>>>>>>>>>>>>> can't >>>>>>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is called. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>>>>>>>>>>>> initialization? >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these field values. >>>>>>>>>>>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I'm not familiar with these tools. When do we produce a core file after >>>>>>>>>>>>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Ah - right. I assume we need to close off the perfdata file before we >>>>>>>>>>>>>>>>>>> abort. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>>>>>>>> #0? perfMemory_exit () >>>>>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>>>>>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>>>>>>>>>>>> #2? 0x00007f99b091c980 in os::abort (dump_core=) >>>>>>>>>>>>>>>>>>>> ?????? 
at >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>>>>>>>>>>>> #3? 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>>>>>>>>>>>> this=this at entry=0x7ffcacf40b50) >>>>>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>>>>>>>>>>>> #4? 0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig at entry=11, >>>>>>>>>>>>>>>>>>>> info=info at entry=0x7ffcacf40df0, >>>>>>>>>>>>>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>>>>>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Assertions and direct guards. Checking _prologue is a placeholder for >>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> real check. 
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David Holmes: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> By chance we ran into this bug which I analysed yesterday: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> #? Internal Error >>>>>>>>>>>>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>>>>>>>>>>>> #? assert(_prologue != __null) failed: called before initialization >>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> which is misleading because it can fail if called before >>>>>>>>>>>>>>>>>>>>>>> initialization, >>>>>>>>>>>>>>>>>>>>>>> or >>>>>>>>>>>>>>>>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue so the assertion >>>>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the deleted memory region! >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields set during >>>>>>>>>>>>>>>>>>>>>>> initialization? But it seems to me that there are various checks of >>>>>>>>>>>>>>>>>>>>>>> _prologue that should really be checking is_initialized() and/or >>>>>>>>>>>>>>>>>>>>>>> is_destroyed() as a guard. 
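[Editor's sketch] The suggestion above, to guard on explicit lifecycle flags instead of testing `_prologue` for NULL, could look roughly like the following. It is a sketch under assumptions, not the webrev's code: `Region` and its members are hypothetical, and concurrency is deliberately ignored. It shows why the pointer check stops being a usable guard once destroy() keeps the fields intact for post-mortem tools.

```cpp
#include <cassert>

// Hypothetical, single-threaded model of the lifecycle being discussed.
class Region {
 public:
  void initialize() {
    if (is_initialized()) return;   // guard on the flag, not on _prologue
    _prologue = &_storage;
    _initialized = true;
  }

  void destroy() {
    if (!is_usable()) return;       // also guards against a double destroy
    _destroyed = true;
    // _prologue is deliberately NOT cleared, so a tool reading a core
    // file can still locate the data after a logical destroy.
  }

  bool is_initialized() const { return _initialized; }
  bool is_destroyed()   const { return _destroyed; }
  bool is_usable()      const { return _initialized && !_destroyed; }

  const int* prologue() const { return _prologue; }

 private:
  int  _storage = 0;
  int* _prologue = nullptr;
  bool _initialized = false;
  bool _destroyed = false;
};
```

With this shape, an assertion such as "called before init or after destroy" can be written against is_usable(), while `_prologue != NULL` would still hold after destroy() and so would no longer catch use of a destroyed region.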
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I added gtest unit test case for this change in new webrev: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa Suenaga: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could not parse core image >>>>>>>>>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse coredump and I've sent >>>>>>>>>>>>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it has not been reviewed >>>>>>>>>>>>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, but we could not >>>>>>>>>>>>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java processes and core >>>>>>>>>>>>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix this issue. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. I think this patch is >>>>>>>>>>>>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed guards double free, and >>>>>>>>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they are not munmap'ed) >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can you cooperate? 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>> >>>>>> > From david.holmes at oracle.com Fri Oct 20 05:05:22 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 20 Oct 2017 15:05:22 +1000 Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap. In-Reply-To: <4b785c83-d5cc-9aab-5186-39765d964e4a@gmail.com> References: <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com> <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com> <41185116-0271-374c-da9e-74e173d6b58c@oracle.com> <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com> <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com> <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com> <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com> <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com> <8e2c7e3f-2f3a-7168-712b-2642a4e88239@oracle.com> <05720ca7-73ef-2761-e8f5-1d0fc45c8bdc@oracle.com> <6bffc41e-80e6-6863-f46a-fba7ba66dc2a@gmail.com> <4b785c83-d5cc-9aab-5186-39765d964e4a@gmail.com> Message-ID: Looks good. (Sorry for the delay.) Thanks, David On 19/10/2017 11:43 PM, Yasumasa Suenaga wrote: > Sorry, I forgot the fix to use OrderAccess::load_acquire() in > PerfMemory::is_initialized(). > I fixed it in new webrev. Could you review again? > > ? 
http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.09/ > > > Yasumasa > > > On 2017/10/19 21:24, David Holmes wrote: >> On 19/10/2017 9:44 PM, Yasumasa Suenaga wrote: >>> Hi, >>> >>>> I suggest we leave the volatile off for now and file a RFE to add >>>> volatile_static_field support to VMStructs and update later. >>> >>> Okay. David or Serguei, could you file it? >>> >>> >>>>> I'd suggest to fall back to your previous approach as >>>>> synchronization was not there >>>>> in the first place, and it is not a part of the original issue you >>>>> are trying to fix >>>>> (if David or anyone else does not a simple solution). >>> >>>> I don't think trying to introduce locking would be a good idea as it >>>> would likely lead to deadlocks when a crash occurs. This could also >>>> be investigated as a future RFE if desired. >>> >>> Sorry, I have mistake the spell of "usable". >>> I've fixed it in new webrev: >>> >>> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.08/ >>> >>> Can I list David and Serguei as Reviewer? >>> I will send a changeset to Serguei if it can. >> >> Yes. >> >> Thanks, >> David >> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2017/10/19 19:41, David Holmes wrote: >>>> Hi Serguei, Yasumasa, >>>> >>>> I suggest we leave the volatile off for now and file a RFE to add >>>> volatile_static_field support to VMStructs and update later. >>>> >>>> I don't think trying to introduce locking would be a good idea as it >>>> would likely lead to deadlocks when a crash occurs. This could also >>>> be investigated as a future RFE if desired. >>>> >>>> Thanks, >>>> David >>>> >>>> On 19/10/2017 7:37 PM, serguei.spitsyn at oracle.com wrote: >>>>> Hi Yasumasa, >>>>> >>>>> I see the problem. >>>>> As it occurred making these variables volatile is non-trivial. >>>>> But thank you a lot for trying! 
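[Editor's sketch] The build failure quoted in this thread ("invalid conversion from 'volatile void*' to 'void*'") is a general C++ rule rather than anything specific to VMStructs: the address of a volatile object cannot be converted implicitly to a plain `void*`. The sketch below assumes a simplified table entry with a `void*` address slot. `Entry`, `Counters` and the helper functions are hypothetical and do not reproduce the real macros in vmStructs.hpp.

```cpp
#include <cassert>

// Hypothetical, simplified table entry with a non-volatile address slot.
struct Entry {
  const char* name;
  void*       address;
};

struct Counters {
  static int          plain;
  static volatile int flagged;
};
int          Counters::plain = 0;
volatile int Counters::flagged = 0;

// Fine: int* converts implicitly to void*.
Entry make_plain_entry() {
  return { "plain", &Counters::plain };
}

// The direct form does not compile, because dropping 'volatile' needs a cast:
//   Entry bad = { "flagged", &Counters::flagged };
//
// One possible workaround (an assumption, not what the thread adopted) is to
// strip the qualifier explicitly when taking the address for the table.
Entry make_flagged_entry() {
  return { "flagged", const_cast<int*>(&Counters::flagged) };
}
```

A dedicated macro for volatile static fields would presumably hide a cast of this kind. The thread instead leaves the field non-volatile for now and defers that support to a follow-up RFE.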
>>>>> >>>>> I'd suggest to fall back to your previous approach as >>>>> synchronization was not there >>>>> in the first place, and it is not a part of the original issue you >>>>> are trying to fix >>>>> (if David or anyone else does not a simple solution). >>>>> But let's check if David does not object against it. >>>>> >>>>> I will sponsor your fix after you send me a patch. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 10/18/17 20:21, Yasumasa Suenaga wrote: >>>>>> Sorry, I have mistake. >>>>>> But I cannot compile yet: >>>>>> >>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>>>>> --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 >>>>>> 15:40:20 2017 +0200 >>>>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 >>>>>> 12:21:11 2017 +0900 >>>>>> @@ -578,7 +578,7 @@ >>>>>> ????? static_field(PerfMemory, _top, >>>>>> char*)???????????????????????????????? \ >>>>>> ????? static_field(PerfMemory, _capacity, >>>>>> size_t)??????????????????????????????? \ >>>>>> ????? static_field(PerfMemory, _prologue, >>>>>> PerfDataPrologue*)???????????????????? \ >>>>>> -???? static_field(PerfMemory, _initialized, >>>>>> jint)????????????????????????????????? \ >>>>>> +???? static_field(PerfMemory, >>>>>> _initialized,????????????????????????????????? volatile >>>>>> jint)??????????????????????????? \ >>>>>> >>>>>> -------------- >>>>>> In file included from >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>>>>> >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:58: >>>>>> error: invalid conversion from 'volatile void*' to 'void*' >>>>>> [-fpermissive] >>>>>> ? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>>> &typeName::fieldName }, >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: >>>>>> note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>> ????? 
static_field(PerfMemory, >>>>>> _initialized,????????????????????????????????? volatile >>>>>> jint)??????????????????????????? \ >>>>>> ????? ^~~~~~~~~~~~ >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>>> note: in expansion of macro 'VM_STRUCTS' >>>>>> ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>> ?? ^ >>>>>> gmake[3]: *** [lib/CompileJvm.gmk:210: >>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] >>>>>> Error 1 >>>>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>>>> >>>>>> ERROR: Build failed for target 'images' in configuration >>>>>> 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>>>> -------------- >>>>>> >>>>>> >>>>>> >>>>>> On 2017/10/19 12:18, Yasumasa Suenaga wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>>> Would the below work? : >>>>>>>> >>>>>>>> ? 578????? static_field(PerfMemory, _initialized, volatile >>>>>>>> jint)????????????????????????????????? \ >>>>>>>> >>>>>>>> It'd be similar to this non-static case: >>>>>>>> ? 362?? nonstatic_field(ConstantPoolCacheEntry, >>>>>>>> _f1,????????????????????????????????? volatile >>>>>>>> Metadata*)??????????????????? \ >>>>>>> >>>>>>> I got error messages as below: >>>>>>> >>>>>>> --------------- >>>>>>> In file included from >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>>>>>> >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>>>>>> error: expected unqualified-id before 'volatile' >>>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>>>> jint)????????????????????????????????? \ >>>>>>> ??????????????????????????????????????? ^ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>>>>>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>> ?? 
{ QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>>>> &typeName::fieldName }, >>>>>>> ^~~~~~~~~ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>>>> note: in expansion of macro 'VM_STRUCTS' >>>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>> ??? ^ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>>>>>> error: expected '}' before 'volatile' >>>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>>>> jint)????????????????????????????????? \ >>>>>>> ??????????????????????????????????????? ^ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>>>>>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>>>> &typeName::fieldName }, >>>>>>> ^~~~~~~~~ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>>>> note: in expansion of macro 'VM_STRUCTS' >>>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>> ??? ^ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>>>>>> error: expected '}' before 'volatile' >>>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>>>> jint)????????????????????????????????? \ >>>>>>> ??????????????????????????????????????? ^ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>>>>>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>>>> &typeName::fieldName }, >>>>>>> ^~~~~~~~~ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>>>> note: in expansion of macro 'VM_STRUCTS' >>>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>> ??? 
^ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:79: >>>>>>> error: expected declaration before '}' token >>>>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>>>> &typeName::fieldName }, >>>>>>> ^ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: >>>>>>> note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>>>> jint)????????????????????????????????? \ >>>>>>> ?????? ^~~~~~~~~~~~ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>>>> note: in expansion of macro 'VM_STRUCTS' >>>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>> ??? ^ >>>>>>> gmake[3]: *** [lib/CompileJvm.gmk:210: >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] >>>>>>> Error 1 >>>>>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>>>>> >>>>>>> ERROR: Build failed for target 'images' in configuration >>>>>>> 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>>>>> --------------- >>>>>>> >>>>>>> >>>>>>> I changed as below: >>>>>>> --------------- >>>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/perfMemory.cpp >>>>>>> --- a/src/hotspot/share/runtime/perfMemory.cpp? Thu Sep 07 >>>>>>> 15:40:20 2017 +0200 >>>>>>> +++ b/src/hotspot/share/runtime/perfMemory.cpp? Thu Oct 19 >>>>>>> 12:15:30 2017 +0900 >>>>>>> @@ -51,8 +51,9 @@ >>>>>>> ??char*??????????????????? PerfMemory::_end = NULL; >>>>>>> ??char*??????????????????? PerfMemory::_top = NULL; >>>>>>> ??size_t?????????????????? PerfMemory::_capacity = 0; >>>>>>> -jint???????????????????? PerfMemory::_initialized = false; >>>>>>> +volatile jint??????????? PerfMemory::_initialized = 0; >>>>>>> ??PerfDataPrologue*??????? PerfMemory::_prologue = NULL; >>>>>>> +volatile bool??????????? 
PerfMemory::_destroyed = false; >>>>>>> >>>>>>> --- a/src/hotspot/share/runtime/perfMemory.hpp? Thu Sep 07 >>>>>>> 15:40:20 2017 +0200 >>>>>>> +++ b/src/hotspot/share/runtime/perfMemory.hpp? Thu Oct 19 >>>>>>> 12:15:30 2017 +0900 >>>>>>> @@ -113,13 +113,15 @@ >>>>>>> ?? */ >>>>>>> ??class PerfMemory : AllStatic { >>>>>>> ????? friend class VMStructs; >>>>>>> +??? friend class PerfMemoryTest; >>>>>>> ??? private: >>>>>>> ????? static char*? _start; >>>>>>> ????? static char*? _end; >>>>>>> ????? static char*? _top; >>>>>>> ????? static size_t _capacity; >>>>>>> ????? static PerfDataPrologue*? _prologue; >>>>>>> -??? static jint?? _initialized; >>>>>>> +??? static volatile jint????? _initialized; >>>>>>> +??? static volatile bool????? _destroyed; >>>>>>> >>>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>>>>>> --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 >>>>>>> 15:40:20 2017 +0200 >>>>>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 >>>>>>> 12:15:30 2017 +0900 >>>>>>> @@ -578,7 +578,7 @@ >>>>>>> ?????? static_field(PerfMemory, _top, >>>>>>> char*)???????????????????????????????? \ >>>>>>> ?????? static_field(PerfMemory, _capacity, >>>>>>> size_t)??????????????????????????????? \ >>>>>>> ?????? static_field(PerfMemory, _prologue, >>>>>>> PerfDataPrologue*)???????????????????? \ >>>>>>> -???? static_field(PerfMemory, _initialized, >>>>>>> jint)????????????????????????????????? \ >>>>>>> +???? static_field(PerfMemory,???????? volatile _initialized, >>>>>>> jint)????????????????????????????????? \ >>>>>>> --------------- >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2017/10/19 6:18, serguei.spitsyn at oracle.com wrote: >>>>>>>> On 10/18/17 06:51, Yasumasa Suenaga wrote: >>>>>>>>> Hi David, Serguei, >>>>>>>>> >>>>>>>>>> because as soon as we have checked is_usable() and abort >>>>>>>>>> happening in another thread may have changed that by calling >>>>>>>>>> destroy. 
>>>>>>>>>>
>>>>>>>>>> This code is basically broken if we hit an abort path instead
>>>>>>>>>> of a normal VM shutdown.
>>>>>>>>>
>>>>>>>>> Can we use MutexLocker for initialize() and destroy() ?
>>>>>>>>>
>>>>>>>>> I've tried to fix about your comments, but I have an issue
>>>>>>>>> about volatile.
>>>>>>>>> PerfMemory.java depends on PerfMemory::_initialized. However
>>>>>>>>> VMStructs cannot handle static volatile variables.
>>>>>>>>> I think two approaches as below:
>>>>>>>>>
>>>>>>>>>   1. Remove _initialized check from PerfMemory.java
>>>>>>>>>      SA will throw UnmappedAddressException if JSnap try to
>>>>>>>>> access invalid address including uninitialized memory.
>>>>>>>>>
>>>>>>>>>   2. Add static volatile support to VMStructs
>>>>>>>>>
>>>>>>>>> Which should we do?
>>>>>>>>> 1. is easy to fix. But 2. might be right way...
>>>>>>>>
>>>>>>>> Would the below work? :
>>>>>>>>
>>>>>>>>   578      static_field(PerfMemory, _initialized, volatile
>>>>>>>> jint)                                  \
>>>>>>>>
>>>>>>>> It'd be similar to this non-static case:
>>>>>>>>   362   nonstatic_field(ConstantPoolCacheEntry,
>>>>>>>> _f1,                                  volatile
>>>>>>>> Metadata*)                    \
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>> On 2017/10/18 21:34, David Holmes wrote:
>>>>>>>>>> Just to clarify ...
>>>>>>>>>>
>>>>>>>>>> On 18/10/2017 10:28 PM, David Holmes wrote:
>>>>>>>>>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for jumping to this review and helping Yasumasa to
>>>>>>>>>>>> sort it out!
>>>>>>>>>>>> I've just discovered that this issue was already on the
>>>>>>>>>>>> table for several months without a significant progress.
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/18/17 02:48, David Holmes wrote:
>>>>>>>>>>>>> Hi Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sorry for a quite late participation.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I looked at the previous webrevs and think that this one
>>>>>>>>>>>>>> is much better.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some concern is if we need any kind of synchronization
>>>>>>>>>>>>>> here, e.g. CAS.
>>>>>>>>>>>>>> But it depends on the PerfMemory class usage.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Should we make the static variables '_initialized' and
>>>>>>>>>>>>>> '_destroyed' volatile?
>>>>>>>>>>>>>
>>>>>>>>>>>>> For good measure - yes.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, the '_initialized' is set to 1 with:
>>>>>>>>>>>>>>     159 OrderAccess::release_store(&_initialized, 1);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Should we do the same to set the '_destroyed'?:
>>>>>>>>>>>>>> 200 _destroyed = true;
>>>>>>>>>>>>>
>>>>>>>>>>>>> There is a benign initialization race but we need the
>>>>>>>>>>>>> release_store to ensure all the data fields can be read if
>>>>>>>>>>>>> _initialized is seen as true. But what is missing is a
>>>>>>>>>>>>> load_acquire() in is_initialized() to ensure we synchronize
>>>>>>>>>>>>> with that store!
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, I noticed that the load_acquire() is missed. :|
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> There is also a potential for a destruction race (if
>>>>>>>>>>>>> multiple aborts happens concurrently in different threads)
>>>>>>>>>>>>> but that also seems benign. In this case there is no data
>>>>>>>>>>>>> being set so the store to _destroyed does not need to be a
>>>>>>>>>>>>> release_store.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not convinced yet this is benign as the
>>>>>>>>>>>> PerfMemory::destroy() has this call:
>>>>>>>>>>>>
197 delete_memory_region();
>>>>>>>>>>>
>>>>>>>>>>> Yes though most of its work ends up being no-ops.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Now, I started thinking about the asserts that call the
>>>>>>>>>>>> is_useable().
>>>>>>>>>>>> Should they be returns instead?
>>>>>>>>>>>
>>>>>>>>>>> I think this is a somewhat confused chunk of code. It's only
>>>>>>>>>>> fractionally thread-safe yet once in use could be in use
>>>>>>>>>>> concurrently with an aborting thread that calls destroy(). I
>>>>>>>>>>> don't think there is any simple fix for this. If we're in the
>>>>>>>>>>> process of crashing does it really matter if we trigger a
>>>>>>>>>>> secondary crash due to this?
>>>>>>>>>>
>>>>>>>>>> It doesn't matter if we do:
>>>>>>>>>>
>>>>>>>>>> assert(is_usable(),...);
>>>>>>>>>> // continue
>>>>>>>>>>
>>>>>>>>>> or
>>>>>>>>>>
>>>>>>>>>> if (!is_usable()) return;
>>>>>>>>>> // continue
>>>>>>>>>>
>>>>>>>>>> because as soon as we have checked is_usable() and abort
>>>>>>>>>> happening in another thread may have changed that by calling
>>>>>>>>>> destroy.
>>>>>>>>>>
>>>>>>>>>> This code is basically broken if we hit an abort path instead
>>>>>>>>>> of a normal VM shutdown.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>> -----
>>>>>>>>>>
>>>>>>>>>>> The problems with this code go way beyond what Yasumasa is
>>>>>>>>>>> trying to address with the JSnap problem and I would not want
>>>>>>>>>>> to put it back on him to try and come up with an overall
>>>>>>>>>>> solution.
>>>>>>>>>>>
>>>>>>>>>>>> Then the is_destroyed() would better to have the
>>>>>>>>>>>> load_acquire().
>>>>>>>>>>>
>>>>>>>>>>> You could add a load_acquire and do the store_release. It
>>>>>>>>>>> certainly would not hurt, but I don't think it would actually
>>>>>>>>>>> benefit anything either.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>> Just interested to know what do you think on this.
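The release_store / load_acquire pairing discussed in the quoted messages can be illustrated outside HotSpot. The following is only a minimal sketch in standard C++ using std::atomic, not HotSpot's OrderAccess API; the names Region, g_region, g_initialized, initialize(), is_initialized() and read_capacity_when_ready() are hypothetical and do not appear in perfMemory.cpp.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for data that an initializer publishes to readers.
struct Region {
    int capacity = 0;
};

static Region g_region;                    // plain data, written before publication
static std::atomic<int> g_initialized{0};  // publication flag

void initialize() {
    g_region.capacity = 4096;  // 1. write the data fields
    // 2. publish: the release store orders the write above before the flag
    g_initialized.store(1, std::memory_order_release);
}

bool is_initialized() {
    // The acquire load pairs with the release store in initialize().
    return g_initialized.load(std::memory_order_acquire) != 0;
}

int read_capacity_when_ready() {
    while (!is_initialized()) {
        std::this_thread::yield();  // wait until the flag has been published
    }
    // Having observed the flag via an acquire load, the reader also
    // observes the data written before the release store.
    return g_region.capacity;
}
```

Without the acquire load on the reader side, a reader that sees the flag set would have no ordering guarantee for the data fields, which is the gap pointed out for is_initialized() in the thread.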
>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Cheers, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you for your comment. >>>>>>>>>>>>>>> I uploaded new webrev: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Serguei, please comment about this :-) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2017-10-18 16:09 GMT+09:00 David >>>>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure >>>>>>>>>>>>>>>>>> the existing ones >>>>>>>>>>>>>>>>>> can't >>>>>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is >>>>>>>>>>>>>>>>>> called. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I've added PerfMemory::is_useable() to check whether we >>>>>>>>>>>>>>>>> can access to >>>>>>>>>>>>>>>>> PerfMemory. >>>>>>>>>>>>>>>>> I think this webrev prevent to access to PerfMemory >>>>>>>>>>>>>>>>> after destroy() call. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ?? 90 void PerfMemory::initialize() { >>>>>>>>>>>>>>>> ?? 91 >>>>>>>>>>>>>>>> ?? 92?? if (_prologue != NULL) >>>>>>>>>>>>>>>> ?? 93???? // initialization already performed >>>>>>>>>>>>>>>> ?? 94???? return; >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> shouldn't check _prologue, but is_initialized(). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ? 213?? 
assert(is_useable(), "called before >>>>>>>>>>>>>>>> initialization"); >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -> "called before init or after destroy" >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Could add a similar assert in PerfMemory::mark_updated(). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Let's see what Serguei thinks. :) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2017-10-18 13:44 GMT+09:00 David >>>>>>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David >>>>>>>>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue >>>>>>>>>>>>>>>>>>>>>> so the assertion >>>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the >>>>>>>>>>>>>>>>>>>>>> deleted memory region! >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does >>>>>>>>>>>>>>>>>>>>> not call munmap() >>>>>>>>>>>>>>>>>>>>> for PerfMemory. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Perhaps not but there are still other actions that >>>>>>>>>>>>>>>>>>>> happen and the point >>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>> we should not be able to continue to use PerfMemory >>>>>>>>>>>>>>>>>>>> once it has been >>>>>>>>>>>>>>>>>>>> destroyed (even if the destruction is only logical). 
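The guard requested in the quoted review (check a state flag rather than _prologue, and keep the fields readable for post-mortem tools after a logical destroy) might look roughly like the sketch below. It is standard C++ with std::atomic, not the actual PerfMemory code; the class CounterRegion and all of its members are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical region with "initialized" and "destroyed" state flags.
class CounterRegion {
 public:
  void initialize(char* start, std::size_t capacity) {
    if (is_initialized()) return;  // guard on the flag, not on a data pointer
    start_ = start;
    capacity_ = capacity;
    initialized_.store(true, std::memory_order_release);
  }

  void destroy() {
    if (!is_initialized()) return;
    // exchange() returns the previous value, so only the first caller proceeds.
    if (destroyed_.exchange(true)) return;
    // Resources would be released here. start_ and capacity_ are deliberately
    // left untouched so that a tool reading a core file can still see them.
  }

  bool is_initialized() const {
    return initialized_.load(std::memory_order_acquire);
  }
  bool is_destroyed() const {
    return destroyed_.load(std::memory_order_acquire);
  }
  // True only between initialize() and destroy().
  bool is_usable() const { return is_initialized() && !is_destroyed(); }

  std::size_t capacity() const { return capacity_; }

 private:
  char* start_ = nullptr;
  std::size_t capacity_ = 0;
  std::atomic<bool> initialized_{false};
  std::atomic<bool> destroyed_{false};
};
```

A check such as is_usable() only narrows the window: another thread can still call destroy() immediately after the check returns true, so this is a sanity guard rather than synchronization.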
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I received same comment from Dmitry in the past, but >>>>>>>>>>>>>>>>>>> we couldn't >>>>>>>>>>>>>>>>>>> decide how should we do. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In that discussion, I uploaded another webrev which >>>>>>>>>>>>>>>>>>> adds other fields >>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>> Is it suitable? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure >>>>>>>>>>>>>>>>>> the existing ones >>>>>>>>>>>>>>>>>> can't >>>>>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy is >>>>>>>>>>>>>>>>>> called. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields >>>>>>>>>>>>>>>>>>>>>> set during >>>>>>>>>>>>>>>>>>>>>> initialization? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these >>>>>>>>>>>>>>>>>>>>> field values. >>>>>>>>>>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I'm not familiar with these tools. When do we >>>>>>>>>>>>>>>>>>>> produce a core file after >>>>>>>>>>>>>>>>>>>> calling PerfMemory::destroy ? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> PerfMemory::destroy() is called before aborting. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Ah - right. I assume we need to close off the perfdata >>>>>>>>>>>>>>>>>> file before we >>>>>>>>>>>>>>>>>> abort. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>>>>>>> #0? perfMemory_exit () >>>>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> #1? 0x00007f99b091c949 in os::shutdown () >>>>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> #2? 0x00007f99b091c980 in os::abort >>>>>>>>>>>>>>>>>>> (dump_core=) >>>>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> #3? 0x00007f99b0b689c3 in VMError::report_and_die ( >>>>>>>>>>>>>>>>>>> ?????? this=this at entry=0x7ffcacf40b50) >>>>>>>>>>>>>>>>>>> ?????? at >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> #4? 0x00007f99b0926f04 in JVM_handle_linux_signal >>>>>>>>>>>>>>>>>>> (sig=sig at entry=11, >>>>>>>>>>>>>>>>>>> ?????? info=info at entry=0x7ffcacf40df0, >>>>>>>>>>>>>>>>>>> ucVoid=ucVoid at entry=0x7ffcacf40cc0, >>>>>>>>>>>>>>>>>>> abort_if_unrecognized=abort_if_unrecognized at entry=1) >>>>>>>>>>>>>>>>>>> ?????? 
at >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ----------------------- >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> But it seems to me that there are various checks of >>>>>>>>>>>>>>>>>>>>>> _prologue that should really be checking >>>>>>>>>>>>>>>>>>>>>> is_initialized() and/or >>>>>>>>>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Should I change all assertions for _prologue? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Assertions and direct guards. Checking _prologue is >>>>>>>>>>>>>>>>>>>> a placeholder for >>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>> real check. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David >>>>>>>>>>>>>>>>>>>>> Holmes: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> By chance we ran into this bug which I analysed >>>>>>>>>>>>>>>>>>>>>> yesterday: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390 >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> We hit the assertion: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> #? Internal Error >>>>>>>>>>>>>>>>>>>>>> (/open/src/hotspot/share/runtime/perfMemory.cpp:216), >>>>>>>>>>>>>>>>>>>>>> pid=17874, tid=17875 >>>>>>>>>>>>>>>>>>>>>> #? 
assert(_prologue != __null) failed: called >>>>>>>>>>>>>>>>>>>>>> before initialization >>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> which is misleading because it can fail if called >>>>>>>>>>>>>>>>>>>>>> before >>>>>>>>>>>>>>>>>>>>>> initialization, >>>>>>>>>>>>>>>>>>>>>> or >>>>>>>>>>>>>>>>>>>>>> after PerfMemory::destroy has been called. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> With your changes you no longer null out _prologue >>>>>>>>>>>>>>>>>>>>>> so the assertion >>>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the >>>>>>>>>>>>>>>>>>>>>> deleted memory region! >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the fields >>>>>>>>>>>>>>>>>>>>>> set during >>>>>>>>>>>>>>>>>>>>>> initialization? But it seems to me that there are >>>>>>>>>>>>>>>>>>>>>> various checks of >>>>>>>>>>>>>>>>>>>>>> _prologue that should really be checking >>>>>>>>>>>>>>>>>>>>>> is_initialized() and/or >>>>>>>>>>>>>>>>>>>>>> is_destroyed() as a guard. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Could you review it? 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I added gtest unit test case for this change in >>>>>>>>>>>>>>>>>>>>>>>> new webrev: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa >>>>>>>>>>>>>>>>>>>>>>>> Suenaga: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.04/ >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 2017/09/21 7:45, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> PING: 
>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On 2017/07/01 23:43, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> PING: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On 2017/06/13 14:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I want to discuss about JDK-8151815: Could >>>>>>>>>>>>>>>>>>>>>>>>>>>> not parse core image >>>>>>>>>>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>>>>>>>>>> JSnap. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> In last year, I found JSnap cannot parse >>>>>>>>>>>>>>>>>>>>>>>>>>>> coredump and I've sent >>>>>>>>>>>>>>>>>>>>>>>>>>>> review >>>>>>>>>>>>>>>>>>>>>>>>>>>> request for it as JDK-8151815. However it >>>>>>>>>>>>>>>>>>>>>>>>>>>> has not been reviewed >>>>>>>>>>>>>>>>>>>>>>>>>>>> yet >>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> We've discussed about safety implementation, >>>>>>>>>>>>>>>>>>>>>>>>>>>> but we could not >>>>>>>>>>>>>>>>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>>>>>>>>> consensus. >>>>>>>>>>>>>>>>>>>>>>>>>>>> IMHO all SA tools should be handled java >>>>>>>>>>>>>>>>>>>>>>>>>>>> processes and core >>>>>>>>>>>>>>>>>>>>>>>>>>>> images, >>>>>>>>>>>>>>>>>>>>>>>>>>>> and PerfCounter value is useful. So I fix >>>>>>>>>>>>>>>>>>>>>>>>>>>> this issue. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev for this issue. I >>>>>>>>>>>>>>>>>>>>>>>>>>>> think this patch is >>>>>>>>>>>>>>>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>>>>>>>>>> because new flag PerfMemory::_destroyed >>>>>>>>>>>>>>>>>>>>>>>>>>>> guards double free, and >>>>>>>>>>>>>>>>>>>>>>>>>>>> all >>>>>>>>>>>>>>>>>>>>>>>>>>>> members in PerfMemory is accessible (they >>>>>>>>>>>>>>>>>>>>>>>>>>>> are not munmap'ed) >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.03/ >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Can you cooperate? 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-April/019480.html
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>
>>>>>

From serguei.spitsyn at oracle.com Fri Oct 20 05:07:33 2017
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Oct 2017 22:07:33 -0700
Subject: PING: RFR: JDK-8151815: Could not parse core image with JSnap.
In-Reply-To:
References: <84922d13-79ac-8904-1885-79ee0c2b4232@oracle.com>
 <967acad8-8378-cf99-0ff9-ba3852d51dd3@oracle.com>
 <41185116-0271-374c-da9e-74e173d6b58c@oracle.com>
 <7cb41834-75fa-9dc5-4f6d-c2b85f84dcee@oracle.com>
 <78b0bfca-1259-e5ef-1189-a7646d7fea36@oracle.com>
 <7b8ae324-c590-bb41-4bb6-2b4d18b12267@gmail.com>
 <024d8188-8d54-3345-d3b6-e757f4cafb6e@oracle.com>
 <38f55e3a-5053-f977-980e-59687cca5fd9@gmail.com>
 <8e2c7e3f-2f3a-7168-712b-2642a4e88239@oracle.com>
 <05720ca7-73ef-2761-e8f5-1d0fc45c8bdc@oracle.com>
 <6bffc41e-80e6-6863-f46a-fba7ba66dc2a@gmail.com>
 <4b785c83-d5cc-9aab-5186-39765d964e4a@gmail.com>
Message-ID:

Thank you, David!

Yasumasa, could you, please, send me a patch?

Thanks,
Serguei

On 10/19/17 22:05, David Holmes wrote:
> Looks good. (Sorry for the delay.)
>
> Thanks,
> David
>
> On 19/10/2017 11:43 PM, Yasumasa Suenaga wrote:
>> Sorry, I forgot the fix to use OrderAccess::load_acquire() in
>> PerfMemory::is_initialized().
>> I fixed it in new webrev. Could you review again?
>>
>>
http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.09/
>>
>> Yasumasa
>>
>> On 2017/10/19 21:24, David Holmes wrote:
>>> On 19/10/2017 9:44 PM, Yasumasa Suenaga wrote:
>>>> Hi,
>>>>
>>>>> I suggest we leave the volatile off for now and file a RFE to add
>>>>> volatile_static_field support to VMStructs and update later.
>>>>
>>>> Okay. David or Serguei, could you file it?
>>>>
>>>>>> I'd suggest to fall back to your previous approach as
>>>>>> synchronization was not there
>>>>>> in the first place, and it is not a part of the original issue
>>>>>> you are trying to fix
>>>>>> (if David or anyone else does not a simple solution).
>>>>
>>>>> I don't think trying to introduce locking would be a good idea as
>>>>> it would likely lead to deadlocks when a crash occurs. This could
>>>>> also be investigated as a future RFE if desired.
>>>>
>>>> Sorry, I have mistake the spell of "usable".
>>>> I've fixed it in new webrev:
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.08/
>>>>
>>>> Can I list David and Serguei as Reviewer?
>>>> I will send a changeset to Serguei if it can.
>>>
>>> Yes.
>>>
>>> Thanks,
>>> David
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>> On 2017/10/19 19:41, David Holmes wrote:
>>>>> Hi Serguei, Yasumasa,
>>>>>
>>>>> I suggest we leave the volatile off for now and file a RFE to add
>>>>> volatile_static_field support to VMStructs and update later.
>>>>>
>>>>> I don't think trying to introduce locking would be a good idea as
>>>>> it would likely lead to deadlocks when a crash occurs. This could
>>>>> also be investigated as a future RFE if desired.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>> On 19/10/2017 7:37 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> I see the problem.
>>>>>> As it occurred making these variables volatile is non-trivial.
>>>>>> But thank you a lot for trying!
>>>>>>
>>>>>> I'd suggest to fall back to your previous approach as
>>>>>> synchronization was not there
>>>>>> in the first place, and it is not a part of the original issue
>>>>>> you are trying to fix
>>>>>> (if David or anyone else does not a simple solution).
>>>>>> But let's check if David does not object against it.
>>>>>>
>>>>>> I will sponsor your fix after you send me a patch.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>> On 10/18/17 20:21, Yasumasa Suenaga wrote:
>>>>>>> Sorry, I have mistake.
>>>>>>> But I cannot compile yet:
>>>>>>>
>>>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp
>>>>>>> --- a/src/hotspot/share/runtime/vmStructs.cpp   Thu Sep 07
>>>>>>> 15:40:20 2017 +0200
>>>>>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp   Thu Oct 19
>>>>>>> 12:21:11 2017 +0900
>>>>>>> @@ -578,7 +578,7 @@
>>>>>>>       static_field(PerfMemory, _top,
>>>>>>> char*)                                 \
>>>>>>>       static_field(PerfMemory, _capacity,
>>>>>>> size_t)                                \
>>>>>>>       static_field(PerfMemory, _prologue,
>>>>>>> PerfDataPrologue*)                     \
>>>>>>> -     static_field(PerfMemory, _initialized,
>>>>>>> jint)                                  \
>>>>>>> +     static_field(PerfMemory, _initialized, volatile
>>>>>>> jint)                            \
>>>>>>>
>>>>>>> --------------
>>>>>>> In file included from
>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0:
>>>>>>>
>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:58:
>>>>>>> error: invalid conversion from 'volatile void*' to 'void*'
>>>>>>> [-fpermissive]
>>>>>>>   { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0,
>>>>>>> &typeName::fieldName },
>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6:
>>>>>>> note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY'
>>>>>>>
static_field(PerfMemory, _initialized, volatile >>>>>>> jint)??????????????????????????? \ >>>>>>> ????? ^~~~~~~~~~~~ >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>>>> note: in expansion of macro 'VM_STRUCTS' >>>>>>> ?? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>> ?? ^ >>>>>>> gmake[3]: *** [lib/CompileJvm.gmk:210: >>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] >>>>>>> Error 1 >>>>>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>>>>> >>>>>>> ERROR: Build failed for target 'images' in configuration >>>>>>> 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>>>>> -------------- >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2017/10/19 12:18, Yasumasa Suenaga wrote: >>>>>>>> Hi Serguei, >>>>>>>> >>>>>>>>> Would the below work? : >>>>>>>>> >>>>>>>>> ? 578????? static_field(PerfMemory, _initialized, volatile >>>>>>>>> jint)????????????????????????????????? \ >>>>>>>>> >>>>>>>>> It'd be similar to this non-static case: >>>>>>>>> ? 362?? nonstatic_field(ConstantPoolCacheEntry, >>>>>>>>> _f1,????????????????????????????????? volatile >>>>>>>>> Metadata*)??????????????????? \ >>>>>>>> >>>>>>>> I got error messages as below: >>>>>>>> >>>>>>>> --------------- >>>>>>>> In file included from >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:104:0: >>>>>>>> >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>>>>>>> error: expected unqualified-id before 'volatile' >>>>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>>>>> jint) \ >>>>>>>> ??????????????????????????????????????? ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>>>>>>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>>> ?? 
{ QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>>>>> &typeName::fieldName }, >>>>>>>> ^~~~~~~~~ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>>>>> note: in expansion of macro 'VM_STRUCTS' >>>>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>>> ??? ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>>>>>>> error: expected '}' before 'volatile' >>>>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>>>>> jint) \ >>>>>>>> ??????????????????????????????????????? ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>>>>>>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>>>>> &typeName::fieldName }, >>>>>>>> ^~~~~~~~~ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>>>>> note: in expansion of macro 'VM_STRUCTS' >>>>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>>> ??? ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:39: >>>>>>>> error: expected '}' before 'volatile' >>>>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>>>>> jint) \ >>>>>>>> ??????????????????????????????????????? ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:69: >>>>>>>> note: in definition of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>>>>> &typeName::fieldName }, >>>>>>>> ^~~~~~~~~ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>>>>> note: in expansion of macro 'VM_STRUCTS' >>>>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>>> ??? 
^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.hpp:168:79: >>>>>>>> error: expected declaration before '}' token >>>>>>>> ?? { QUOTE(typeName), QUOTE(fieldName), QUOTE(type), 1, 0, >>>>>>>> &typeName::fieldName }, >>>>>>>> ^ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:581:6: >>>>>>>> note: in expansion of macro 'GENERATE_STATIC_VM_STRUCT_ENTRY' >>>>>>>> ?????? static_field(PerfMemory,???????? volatile _initialized, >>>>>>>> jint) \ >>>>>>>> ?????? ^~~~~~~~~~~~ >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/src/hotspot/share/runtime/vmStructs.cpp:2934:3: >>>>>>>> note: in expansion of macro 'VM_STRUCTS' >>>>>>>> ??? VM_STRUCTS(GENERATE_NONSTATIC_VM_STRUCT_ENTRY, >>>>>>>> ??? ^ >>>>>>>> gmake[3]: *** [lib/CompileJvm.gmk:210: >>>>>>>> /home/ysuenaga/OpenJDK/jdk10-hs/build/linux-x86_64-normal-server-fastdebug/hotspot/variant-server/libjvm/objs/vmStructs.o] >>>>>>>> Error 1 >>>>>>>> gmake[2]: *** [make/Main.gmk:266: hotspot-server-libs] Error 2 >>>>>>>> >>>>>>>> ERROR: Build failed for target 'images' in configuration >>>>>>>> 'linux-x86_64-normal-server-fastdebug' (exit code 2) >>>>>>>> --------------- >>>>>>>> >>>>>>>> >>>>>>>> I changed as below: >>>>>>>> --------------- >>>>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/perfMemory.cpp >>>>>>>> --- a/src/hotspot/share/runtime/perfMemory.cpp? Thu Sep 07 >>>>>>>> 15:40:20 2017 +0200 >>>>>>>> +++ b/src/hotspot/share/runtime/perfMemory.cpp? Thu Oct 19 >>>>>>>> 12:15:30 2017 +0900 >>>>>>>> @@ -51,8 +51,9 @@ >>>>>>>> ??char*??????????????????? PerfMemory::_end = NULL; >>>>>>>> ??char*??????????????????? PerfMemory::_top = NULL; >>>>>>>> ??size_t?????????????????? PerfMemory::_capacity = 0; >>>>>>>> -jint???????????????????? PerfMemory::_initialized = false; >>>>>>>> +volatile jint??????????? PerfMemory::_initialized = 0; >>>>>>>> ??PerfDataPrologue*??????? PerfMemory::_prologue = NULL; >>>>>>>> +volatile bool??????????? 
PerfMemory::_destroyed = false; >>>>>>>> >>>>>>>> --- a/src/hotspot/share/runtime/perfMemory.hpp? Thu Sep 07 >>>>>>>> 15:40:20 2017 +0200 >>>>>>>> +++ b/src/hotspot/share/runtime/perfMemory.hpp? Thu Oct 19 >>>>>>>> 12:15:30 2017 +0900 >>>>>>>> @@ -113,13 +113,15 @@ >>>>>>>> ?? */ >>>>>>>> ??class PerfMemory : AllStatic { >>>>>>>> ????? friend class VMStructs; >>>>>>>> +??? friend class PerfMemoryTest; >>>>>>>> ??? private: >>>>>>>> ????? static char*? _start; >>>>>>>> ????? static char*? _end; >>>>>>>> ????? static char*? _top; >>>>>>>> ????? static size_t _capacity; >>>>>>>> ????? static PerfDataPrologue*? _prologue; >>>>>>>> -??? static jint?? _initialized; >>>>>>>> +??? static volatile jint????? _initialized; >>>>>>>> +??? static volatile bool????? _destroyed; >>>>>>>> >>>>>>>> diff -r 3e7702cd3f19 src/hotspot/share/runtime/vmStructs.cpp >>>>>>>> --- a/src/hotspot/share/runtime/vmStructs.cpp?? Thu Sep 07 >>>>>>>> 15:40:20 2017 +0200 >>>>>>>> +++ b/src/hotspot/share/runtime/vmStructs.cpp?? Thu Oct 19 >>>>>>>> 12:15:30 2017 +0900 >>>>>>>> @@ -578,7 +578,7 @@ >>>>>>>> ?????? static_field(PerfMemory, _top, >>>>>>>> char*)???????????????????????????????? \ >>>>>>>> ?????? static_field(PerfMemory, _capacity, >>>>>>>> size_t)??????????????????????????????? \ >>>>>>>> ?????? static_field(PerfMemory, _prologue, >>>>>>>> PerfDataPrologue*)???????????????????? \ >>>>>>>> -???? static_field(PerfMemory, _initialized, >>>>>>>> jint)????????????????????????????????? \ >>>>>>>> +???? static_field(PerfMemory,???????? volatile _initialized, >>>>>>>> jint) \ >>>>>>>> --------------- >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2017/10/19 6:18, serguei.spitsyn at oracle.com wrote: >>>>>>>>> On 10/18/17 06:51, Yasumasa Suenaga wrote: >>>>>>>>>> Hi David, Serguei, >>>>>>>>>> >>>>>>>>>>> because as soon as we have checked is_usable() and abort >>>>>>>>>>> happening in another thread may have changed that by calling >>>>>>>>>>> destroy. 
>>>>>>>>>>>
>>>>>>>>>>> This code is basically broken if we hit an abort path
>>>>>>>>>>> instead of a normal VM shutdown.
>>>>>>>>>>
>>>>>>>>>> Can we use MutexLocker for initialize() and destroy() ?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I've tried to fix about your comments, but I have an issue
>>>>>>>>>> about volatile.
>>>>>>>>>> PerfMemory.java depends on PerfMemory::_initialized. However
>>>>>>>>>> VMStructs cannot handle static volatile variables.
>>>>>>>>>> I think two approaches as below:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   1. Remove _initialized check from PerfMemory.java
>>>>>>>>>>      SA will throw UnmappedAddressException if JSnap try to
>>>>>>>>>> access invalid address including uninitialized memory.
>>>>>>>>>>
>>>>>>>>>>   2. Add static volatile support to VMStructs
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Which should we do?
>>>>>>>>>> 1. is easy to fix. But 2. might be right way...
>>>>>>>>>
>>>>>>>>> Would the below work? :
>>>>>>>>>
>>>>>>>>>   578      static_field(PerfMemory, _initialized, volatile jint) \
>>>>>>>>>
>>>>>>>>> It'd be similar to this non-static case:
>>>>>>>>>   362   nonstatic_field(ConstantPoolCacheEntry, _f1, volatile Metadata*) \
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2017/10/18 21:34, David Holmes wrote:
>>>>>>>>>>> Just to clarify ...
>>>>>>>>>>>
>>>>>>>>>>> On 18/10/2017 10:28 PM, David Holmes wrote:
>>>>>>>>>>>> On 18/10/2017 8:26 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for jumping to this review and helping Yasumasa
>>>>>>>>>>>>> to sort it out!
>>>>>>>>>>>>> I've just discovered that this issue was already on the
>>>>>>>>>>>>> table for several months without a significant progress.
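On the VMStructs question raised above: the compile error reported elsewhere in this thread ("invalid conversion from 'volatile void*' to 'void*'") is what C++ produces when the address of a volatile static is stored into a plain void* slot. The following is a minimal sketch of that language rule only, not of the HotSpot macros. DemoEntry, DemoMemory and make_demo_entry are hypothetical names introduced for illustration, and whether VMStructs should gain such a cast is exactly what the thread leaves open.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical table entry, loosely modelled on a name/address record.
struct DemoEntry {
  const char* type_name;
  const char* field_name;
  void*       address;  // plain void*: volatile-qualified addresses do not convert implicitly
};

// Hypothetical class with a volatile static flag.
struct DemoMemory {
  static volatile std::int32_t _initialized;
};
volatile std::int32_t DemoMemory::_initialized = 0;

// Does not compile: 'volatile int32_t*' converts to 'volatile void*', not to 'void*'.
// DemoEntry bad = { "DemoMemory", "_initialized", &DemoMemory::_initialized };

// Compiles: the qualifier is removed explicitly. Only the address is recorded;
// the flag is never read or written through the unqualified pointer.
inline DemoEntry make_demo_entry() {
  return DemoEntry{ "DemoMemory", "_initialized",
                    const_cast<std::int32_t*>(&DemoMemory::_initialized) };
}
```

The cast is confined to the place where the address is taken, so the field itself stays volatile for every other user.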
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10/18/17 02:48, David Holmes wrote:
>>>>>>>>>>>>>> Hi Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 18/10/2017 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sorry for a quite late participation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I looked at the previous webrevs and think that this one
>>>>>>>>>>>>>>> is much better.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some concern is if we need any kind of synchronization
>>>>>>>>>>>>>>> here, e.g. CAS.
>>>>>>>>>>>>>>> But it depends on the PerfMemory class usage.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Should we make the static variables '_initialized' and
>>>>>>>>>>>>>>> '_destroyed' volatile?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For good measure - yes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, the '_initialized' is set to 1 with:
>>>>>>>>>>>>>>>     159 OrderAccess::release_store(&_initialized, 1);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Should we do the same to set the '_destroyed'?:
>>>>>>>>>>>>>>>     200 _destroyed = true;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There is a benign initialization race but we need the
>>>>>>>>>>>>>> release_store to ensure all the data fields can be read
>>>>>>>>>>>>>> if _initialized is seen as true. But what is missing is a
>>>>>>>>>>>>>> load_acquire() in is_initialized() to ensure we
>>>>>>>>>>>>>> synchronize with that store!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, I noticed that the load_acquire() is missed. :|
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There is also a potential for a destruction race (if
>>>>>>>>>>>>>> multiple aborts happens concurrently in different
>>>>>>>>>>>>>> threads) but that also seems benign. In this case there
>>>>>>>>>>>>>> is no data being set so the store to _destroyed does not
>>>>>>>>>>>>>> need to be a release_store.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not convinced yet this is benign as the
>>>>>>>>>>>>> PerfMemory::destroy() has this call:
>>>>>>>>>>>>>     197 delete_memory_region();
>>>>>>>>>>>>
>>>>>>>>>>>> Yes though most of its work ends up being no-ops.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now, I started thinking about the asserts that call the
>>>>>>>>>>>>> is_useable().
>>>>>>>>>>>>> Should they be returns instead?
>>>>>>>>>>>>
>>>>>>>>>>>> I think this is a somewhat confused chunk of code. It's
>>>>>>>>>>>> only fractionally thread-safe yet once in use could be in
>>>>>>>>>>>> use concurrently with an aborting thread that calls
>>>>>>>>>>>> destroy(). I don't think there is any simple fix for this.
>>>>>>>>>>>> If we're in the process of crashing does it really matter
>>>>>>>>>>>> if we trigger a secondary crash due to this?
>>>>>>>>>>>
>>>>>>>>>>> It doesn't matter if we do:
>>>>>>>>>>>
>>>>>>>>>>> assert(is_usable(),...);
>>>>>>>>>>> // continue
>>>>>>>>>>>
>>>>>>>>>>> or
>>>>>>>>>>>
>>>>>>>>>>> if (!is_usable()) return;
>>>>>>>>>>> // continue
>>>>>>>>>>>
>>>>>>>>>>> because as soon as we have checked is_usable() and abort
>>>>>>>>>>> happening in another thread may have changed that by calling
>>>>>>>>>>> destroy.
>>>>>>>>>>>
>>>>>>>>>>> This code is basically broken if we hit an abort path
>>>>>>>>>>> instead of a normal VM shutdown.
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>> The problems with this code go way beyond what Yasumasa is
>>>>>>>>>>>> trying to address with the JSnap problem and I would not
>>>>>>>>>>>> want to put it back on him to try and come up with an
>>>>>>>>>>>> overall solution.
>>>>>>>>>>>>
>>>>>>>>>>>>> Then the is_destroyed() would better to have the
>>>>>>>>>>>>> load_acquire().
>>>>>>>>>>>>
>>>>>>>>>>>> You could add a load_acquire and do the store_release. It
>>>>>>>>>>>> certainly would not hurt, but I don't think it would
>>>>>>>>>>>> actually benefit anything either.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>> Just interested to know what do you think on this.
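The release_store / load_acquire pairing discussed above can be sketched in portable C++ with std::atomic. This is only an illustration under stated assumptions: PerfRegion and its members are hypothetical stand-ins, std::atomic replaces HotSpot's OrderAccess, and a single initializing thread is assumed. It is not the PerfMemory code from the webrev.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a PerfMemory-like class; names are illustrative only.
class PerfRegion {
 public:
  // Write the data fields first, then publish them by setting the flag
  // with release semantics. Assumes one initializing thread.
  static void initialize(char* start, std::size_t capacity) {
    _start = start;
    _capacity = capacity;
    _initialized.store(1, std::memory_order_release);
  }

  // The acquire load pairs with the release store above: a reader that
  // sees 1 here also sees the values written to _start and _capacity.
  static bool is_initialized() {
    return _initialized.load(std::memory_order_acquire) != 0;
  }

  // No data is published together with this flag, so a relaxed store is
  // used here, matching the argument that no release_store is needed.
  static void destroy() {
    _destroyed.store(true, std::memory_order_relaxed);
  }

  static bool is_destroyed() {
    return _destroyed.load(std::memory_order_relaxed);
  }

  static char* start() { return _start; }
  static std::size_t capacity() { return _capacity; }

 private:
  static char* _start;
  static std::size_t _capacity;
  static std::atomic<int> _initialized;
  static std::atomic<bool> _destroyed;
};

char* PerfRegion::_start = nullptr;
std::size_t PerfRegion::_capacity = 0;
std::atomic<int> PerfRegion::_initialized{0};
std::atomic<bool> PerfRegion::_destroyed{false};
```

A reader is expected to call is_initialized() before touching start() or capacity(). Using acquire/release on the destroyed flag as well would also be valid; as noted in the thread, it would not hurt but is not expected to change much.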
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10/18/17 00:39, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you for your comment.
>>>>>>>>>>>>>>>> I uploaded new webrev:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.07/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Serguei, please comment about this :-)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2017-10-18 16:09 GMT+09:00 David
>>>>>>>>>>>>>>>> Holmes:
>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 18/10/2017 4:34 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure
>>>>>>>>>>>>>>>>>>> the existing ones
>>>>>>>>>>>>>>>>>>> can't
>>>>>>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy
>>>>>>>>>>>>>>>>>>> is called.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I've added PerfMemory::is_useable() to check whether
>>>>>>>>>>>>>>>>>> we can access to
>>>>>>>>>>>>>>>>>> PerfMemory.
>>>>>>>>>>>>>>>>>> I think this webrev prevent to access to PerfMemory
>>>>>>>>>>>>>>>>>> after destroy() call.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.06/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    90 void PerfMemory::initialize() {
>>>>>>>>>>>>>>>>>    91
>>>>>>>>>>>>>>>>>    92   if (_prologue != NULL)
>>>>>>>>>>>>>>>>>    93     // initialization already performed
>>>>>>>>>>>>>>>>>    94     return;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> shouldn't check _prologue, but is_initialized().
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   213   assert(is_useable(), "called before initialization");
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -> "called before init or after destroy"
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Could add a similar assert in PerfMemory::mark_updated().
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Let's see what Serguei thinks. :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2017-10-18 13:44 GMT+09:00 David
>>>>>>>>>>>>>>>>>> Holmes:
>>>>>>>>>>>>>>>>>>> On 18/10/2017 2:27 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 2017-10-18 12:55 GMT+09:00 David
>>>>>>>>>>>>>>>>>>>> Holmes:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 18/10/2017 12:37 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> With your changes you no longer null out
>>>>>>>>>>>>>>>>>>>>>>> _prologue so the assertion
>>>>>>>>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the
>>>>>>>>>>>>>>>>>>>>>>> deleted memory region!
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Linux, PerfMemory::delete_memory_region() does
>>>>>>>>>>>>>>>>>>>>>> not call munmap()
>>>>>>>>>>>>>>>>>>>>>> for PerfMemory.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Perhaps not but there are still other actions that
>>>>>>>>>>>>>>>>>>>>> happen and the point
>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>> we should not be able to continue to use
>>>>>>>>>>>>>>>>>>>>> PerfMemory once it has been
>>>>>>>>>>>>>>>>>>>>> destroyed (even if the destruction is only logical).
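The point about guarding on state flags rather than on _prologue, while leaving the fields intact for post-mortem tools, can be sketched as follows. This is a hedged illustration, not the webrev: PerfArea, DemoPrologue, prologue_for_tools() and the bool-returning mark_updated() are hypothetical names, and std::atomic stands in for HotSpot's OrderAccess. As pointed out in the thread, a check like this does not close the race with a concurrently aborting thread; it only makes the single-threaded contract explicit.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical prologue record; only a counter is needed for the sketch.
struct DemoPrologue { int mod_count; };

class PerfArea {
 public:
  static void initialize(DemoPrologue* p) {
    if (is_initialized()) return;  // guard on the flag, not on _prologue
    _prologue = p;
    _initialized.store(true, std::memory_order_release);
  }

  // Logical destruction only: _prologue is deliberately left untouched so
  // that a tool reading a core file can still find the region.
  static void destroy() {
    if (!is_usable()) return;
    _destroyed.store(true, std::memory_order_release);
  }

  static bool is_initialized() { return _initialized.load(std::memory_order_acquire); }
  static bool is_destroyed()   { return _destroyed.load(std::memory_order_acquire); }

  // "Called before init or after destroy" are both covered by this check.
  static bool is_usable() { return is_initialized() && !is_destroyed(); }

  // Returns false instead of touching the region when it is not usable.
  static bool mark_updated() {
    if (!is_usable()) return false;
    _prologue->mod_count++;
    return true;
  }

  // Hypothetical accessor standing in for what an external tool would read.
  static DemoPrologue* prologue_for_tools() { return _prologue; }

 private:
  static DemoPrologue* _prologue;
  static std::atomic<bool> _initialized;
  static std::atomic<bool> _destroyed;
};

DemoPrologue* PerfArea::_prologue = nullptr;
std::atomic<bool> PerfArea::_initialized{false};
std::atomic<bool> PerfArea::_destroyed{false};
```

After destroy(), the pointer is still non-null, which is why a `_prologue != NULL` check can no longer tell a live region from a destroyed one.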
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I received same comment from Dmitry in the past,
>>>>>>>>>>>>>>>>>>>> but we couldn't
>>>>>>>>>>>>>>>>>>>> decide how should we do.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2016-May/019728.html
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In that discussion, I uploaded another webrev which
>>>>>>>>>>>>>>>>>>>> adds other fields
>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>> JSnap.
>>>>>>>>>>>>>>>>>>>> Is it suitable?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.02/
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I don't think we need the extra fields, just ensure
>>>>>>>>>>>>>>>>>>> the existing ones
>>>>>>>>>>>>>>>>>>> can't
>>>>>>>>>>>>>>>>>>> be accessed (other than by the tools) after destroy
>>>>>>>>>>>>>>>>>>> is called.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the
>>>>>>>>>>>>>>>>>>>>>>> fields set during
>>>>>>>>>>>>>>>>>>>>>>> initialization?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> PerfMemory.java in jdk.hotspot.agent needs these
>>>>>>>>>>>>>>>>>>>>>> field values.
>>>>>>>>>>>>>>>>>>>>>> `jhsdb jsnap --core` is failed if they are cleared.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I'm not familiar with these tools. When do we
>>>>>>>>>>>>>>>>>>>>> produce a core file after
>>>>>>>>>>>>>>>>>>>>> calling PerfMemory::destroy ?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> PerfMemory::destroy() is called before aborting.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Ah - right. I assume we need to close off the
>>>>>>>>>>>>>>>>>>> perfdata file before we
>>>>>>>>>>>>>>>>>>> abort.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -----------------------
>>>>>>>>>>>>>>>>>>>> #0  perfMemory_exit ()
>>>>>>>>>>>>>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/perfMemory.cpp:80
>>>>>>>>>>>>>>>>>>>> #1  0x00007f99b091c949 in os::shutdown ()
>>>>>>>>>>>>>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1483
>>>>>>>>>>>>>>>>>>>> #2  0x00007f99b091c980 in os::abort (dump_core=)
>>>>>>>>>>>>>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1503
>>>>>>>>>>>>>>>>>>>> #3  0x00007f99b0b689c3 in VMError::report_and_die (this=this@entry=0x7ffcacf40b50)
>>>>>>>>>>>>>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
>>>>>>>>>>>>>>>>>>>> #4  0x00007f99b0926f04 in JVM_handle_linux_signal (sig=sig@entry=11,
>>>>>>>>>>>>>>>>>>>>     info=info@entry=0x7ffcacf40df0,
>>>>>>>>>>>>>>>>>>>>     ucVoid=ucVoid@entry=0x7ffcacf40cc0,
>>>>>>>>>>>>>>>>>>>>     abort_if_unrecognized=abort_if_unrecognized@entry=1)
>>>>>>>>>>>>>>>>>>>>     at /usr/src/debug/java-1.8.0-openjdk-1.8.0.144-7.b01.fc26.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
>>>>>>>>>>>>>>>>>>>> -----------------------
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> But it seems to me that there are various checks of
>>>>>>>>>>>>>>>>>>>>>>> _prologue that should really be checking
>>>>>>>>>>>>>>>>>>>>>>> is_initialized() and/or
>>>>>>>>>>>>>>>>>>>>>>> is_destroyed() as a guard.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Should I change all assertions for _prologue?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Assertions and direct guards. Checking _prologue
>>>>>>>>>>>>>>>>>>>>> is a placeholder for
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> real check.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 2017-10-18 10:53 GMT+09:00 David
>>>>>>>>>>>>>>>>>>>>>> Holmes:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> By chance we ran into this bug which I analysed
>>>>>>>>>>>>>>>>>>>>>>> yesterday:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189390
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> We hit the assertion:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> #  Internal Error (/open/src/hotspot/share/runtime/perfMemory.cpp:216), pid=17874, tid=17875
>>>>>>>>>>>>>>>>>>>>>>> #  assert(_prologue != __null) failed: called before initialization
>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> which is misleading because it can fail if
>>>>>>>>>>>>>>>>>>>>>>> called before
>>>>>>>>>>>>>>>>>>>>>>> initialization,
>>>>>>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>>>> after PerfMemory::destroy has been called.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> With your changes you no longer null out
>>>>>>>>>>>>>>>>>>>>>>> _prologue so the assertion
>>>>>>>>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>>>>> now not fail and we'd proceed to access the
>>>>>>>>>>>>>>>>>>>>>>> deleted memory region!
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I'm unclear why you no longer clear all the
>>>>>>>>>>>>>>>>>>>>>>> fields set during
>>>>>>>>>>>>>>>>>>>>>>> initialization? But it seems to me that there
>>>>>>>>>>>>>>>>>>>>>>> are various checks of
>>>>>>>>>>>>>>>>>>>>>>> _prologue that should really be checking
>>>>>>>>>>>>>>>>>>>>>>> is_initialized() and/or
>>>>>>>>>>>>>>>>>>>>>>> is_destroyed() as a guard.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 16/10/2017 11:25 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> PING:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Could you review it?
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 2017/10/03 13:18, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I added gtest unit test case for this change >>>>>>>>>>>>>>>>>>>>>>>>> in new webrev: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8151815/webrev.05/ >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Could you review it? >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> 2017-09-27 0:01 GMT+09:00 Yasumasa >>>>>>>>>>>>>>>>>>>>>>>>> Suenaga: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/h