<html><head></head><body>How many Java threads are involved in monitor Inflation ? Parallelization is spread by Java threads (i.e. each worker claims and deflates monitors of 1 java thread per step).<br>
<br>
Roman<br><br><div class="gmail_quote">Am 29. Juni 2017 12:49:58 MESZ schrieb Robbin Ehn <robbin.ehn@oracle.com>:<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<pre class="k9mail">Hi Roman,<br /><br />I haven't had the time to test all scenarios, and the numbers are just an indication:<br /><br />Do it VM thread, MonitorUsedDeflationThreshold=0, 0.002782s avg, avg of 10 worsed cleanups 0.0173s<br />Do it 4 workers, MonitorUsedDeflationThreshold=0, 0.002923s avg, avg of 10 worsed cleanups 0.0199s<br />Do it VM thread, MonitorUsedDeflationThreshold=1, 0.001889s avg, avg of 10 worsed cleanups 0.0066s<br /><br />When MonitorUsedDeflationThreshold=0 we are talking about 120000 free monitors to deflate.<br />And I get worse numbers doing the cleanup in 4 threads.<br /><br />Any idea why I see these numbers?<br /><br />Thanks, Robbin<br /><br />On 06/28/2017 10:23 PM, Roman Kennke wrote:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #729fcf; padding-left: 1ex;"> <br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ad7fa8; padding-left: 1ex;"><br /> On 06/27/2017 09:47 PM, Roman Kennke wrote:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #8ae234; padding-left: 1ex;"> Hi Robbin,<br /><br /> Ugh. Thanks for catching this.<br /> Problem was that I was accounting the thread-local deflations twice:<br /> once in thread-local processing (basically a leftover from my earlier<br /> attempt to implement this accounting) and then again in<br /> finish_deflate_idle_monitors(). Should be fixed here:<br /><br /> <a href="http://cr.openjdk.java.net/~rkennke/8180932/webrev.09">http://cr.openjdk.java.net/~rkennke/8180932/webrev.09</a>/<br /> <<a href="http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.09">http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.09</a>/><br /></blockquote><br /> Nit:<br /> safepoint.cpp : ParallelSPCleanupTask<br /> "const char* name = " is not needed and 1 is unused</blockquote><br /> Sorry, I don't understand what you mean by this. I see code like this:<br /> <br /> const char* name = "deflating idle monitors";<br /> <br /> and it is used a few lines below, even 2x.<br /> <br /> What's '1 is unused' ?<br /> <br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ad7fa8; padding-left: 1ex;"><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #8ae234; padding-left: 1ex;"><br /> Side question: which jtreg targets do you usually run?<br /></blockquote><br /> Right now I cherry pick directories from: hotspot/test/<br /><br /> I'm going to add a decent test group for local testing.<br /></blockquote> That would be good!<br /> <br /> <br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ad7fa8; padding-left: 1ex;"><br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #8ae234; padding-left: 1ex;"><br /> Trying: make test TEST=hotspot_all<br /> gives me *lots* of failures due to missing jcstress stuff (?!)<br /> And even other subsets seem to depend on several bits and pieces<br /> that I<br /> have no idea about.<br /></blockquote><br /> Yes, you need to use internal tool 'jib' java integrate build to get<br /> that work or you can set some environment where the jcstress<br /> application stuff is...<br /></blockquote> Uhhh. We really do want a subset of tests that we can run reliably and<br /> that are self-contained, how else are people (without that jib thingy)<br /> supposed to do some sanity checking with their patches? ;-)<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ad7fa8; padding-left: 1ex;"> I have a regression on ClassLoaderData root scanning, this should not<br /> be related,<br /> but I only have 3 patches which could cause this, if it's not<br /> something in the environment that have changed.<br /></blockquote> Let me know if it's my patch :-)<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ad7fa8; padding-left: 1ex;"><br /> Also do not see any immediate performance gains (off vs 4 threads), it<br /> might be <a href="http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/06994badeb24">http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/06994badeb24</a><br /> , but I need to-do some more testing. I know you often run with none<br /> default GSI.<br /></blockquote> <br /> First of all, during the course of this review I reduced the change from<br /> an actual implementation to a kind of framework, and it needs some<br /> separate changes in the GC to make use of it. Not sure if you added<br /> corresponding code in (e.g.) G1?<br /> <br /> Also, this is only really visible in code that makes excessive use of<br /> monitors, i.e. the one linked by Carsten's original patch, or the test<br /> org.openjdk.gcbench.roots.Synchronizers.test in gc-bench:<br /> <br /> <a href="http://icedtea.classpath.org/hg/gc-bench">http://icedtea.classpath.org/hg/gc-bench</a>/<br /> <br /> There are also some popular real-world apps that tend to do this. From<br /> the top off my head, Cassandra is such an application.<br /> <br /> Thanks, Roman<br /> <br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ad7fa8; padding-left: 1ex;"><br /> I'll get back to you.<br /><br /> Thanks, Robbin<br /><br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #8ae234; padding-left: 1ex;"><br /> Roman<br /><br /> Am 27.06.2017 um 16:51 schrieb Robbin Ehn:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #fcaf3e; padding-left: 1ex;"> Hi Roman,<br /><br /> There is something wrong in calculations:<br /> INFO: Deflate: InCirc=43 InUse=18 Scavenged=25 ForceMonitorScavenge=0<br /> : pop=27051 free=215487<br /><br /> free is larger than population, have not had the time to dig into this.<br /><br /> Thanks, Robbin<br /><br /> On 06/22/2017 10:19 PM, Roman Kennke wrote:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #e9b96e; padding-left: 1ex;"> So here's the latest iteration of that patch:<br /><br /> <a href="http://cr.openjdk.java.net/~rkennke/8180932/webrev.08">http://cr.openjdk.java.net/~rkennke/8180932/webrev.08</a>/<br /> <<a href="http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.08">http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.08</a>/><br /><br /> I checked and fixed all the counters. The problem here is that they<br /> are<br /> not updated in a single place (deflate_idle_monitors() ) but in<br /> several<br /> places, potentially by multiple threads. I split up deflation into<br /> prepare_.. and a finish_.. methods to initialize local and update<br /> global<br /> counters respectively, and pass around a counters object (allocated on<br /> stack) to the various code paths that use it. Updating the counters<br /> always happen under a lock, there's no need to do anything special<br /> with<br /> regards to concurrency.<br /><br /> I also checked the nmethod marking, but there doesn't seem to be<br /> anything in that code that looks problematic under concurrency. The<br /> worst that can happen is that two threads write the same value into an<br /> nmethod field. I think we can live with that ;-)<br /><br /> Good to go?<br /><br /> Tested by running specjvm and jcstress fastdebug+release without<br /> issues.<br /><br /> Roman<br /><br /> Am 02.06.2017 um 12:39 schrieb Robbin Ehn:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"> Hi Roman,<br /><br /> On 06/02/2017 11:41 AM, Roman Kennke wrote:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"> Hi David,<br /> thanks for reviewing. I'll be on vacation the next two weeks too,<br /> with<br /> only sporadic access to work stuff.<br /> Yes, exposure will not be as good as otherwise, but it's not totally<br /> untested either: the serial code path is the same as the<br /> parallel, the<br /> only difference is that it's not actually called by multiple<br /> threads.<br /> It's ok I think.<br /><br /> I found two more issues that I think should be addressed:<br /> - There are some counters in deflate_idle_monitors() and I'm not<br /> sure I<br /> correctly handle them in the split-up and MT'ed thread-local/ global<br /> list deflation<br /> - nmethod marking seems to unconditionally poke true or something<br /> like<br /> that in nmethod fields. This doesn't hurt correctness-wise, but it's<br /> probably worth checking if it's already true, especially when doing<br /> this<br /> with multiple threads concurrently.<br /><br /> I'll send an updated patch around later, I hope I can get to it<br /> today...<br /></blockquote><br /> I'll review that when you get it out.<br /> I think this looks as a reasonable step before we tackle this with a<br /> major effort, such as the JEP you and Carsten doing.<br /> And another effort to 'fix' nmethods marking.<br /><br /> Internal discussion yesterday lead us to conclude that the runtime<br /> will probably need more threads.<br /> This would be a good driver to do a 'global' worker pool which serves<br /> both gc, runtime and safepoints with threads.<br /><br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"><br /> Roman<br /><br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"> Hi Roman,<br /><br /> I am about to disappear on an extended vacation so will let others<br /> pursue this. IIUC this is longer an opt-in by the user at runtime,<br /> but<br /> an opt-in by the particular GC developers. Okay. My only concern<br /> with<br /> that is if Shenandoah is the only GC that currently opts in then<br /> this<br /> code is not going to get much testing and will be more prone to<br /> incidental breakage.<br /></blockquote></blockquote><br /> As I mentioned before, it seem like Erik Ö have some idea, maybe he<br /> can do this after his barrier patch.<br /><br /> Thanks!<br /><br /> /Robbin<br /><br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"><br /> Cheers,<br /> David<br /><br /> On 2/06/2017 2:21 AM, Roman Kennke wrote:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"> Am 01.06.2017 um 17:50 schrieb Roman Kennke:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"> Am 01.06.2017 um 14:18 schrieb Robbin Ehn:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"> Hi Roman,<br /><br /> On 06/01/2017 11:29 AM, Roman Kennke wrote:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"> Am 31.05.2017 um 22:06 schrieb Robbin Ehn:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"> Hi Roman, I agree that is really needed but:<br /><br /> On 05/31/2017 10:27 AM, Roman Kennke wrote:<br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"> I realized that sharing workers with GC is not so easy.<br /><br /> We need to be able to use the workers at a safepoint during<br /> concurrent<br /> GC work (which also uses the same workers). This does not<br /> only<br /> require<br /> that those workers be suspended, like e.g.<br /> SuspendibleThreadSet::yield(), but they need to be idle, i.e.<br /> have<br /> finished their tasks. This needs some careful handling to<br /> work<br /> without<br /> races: it requires a SuspendibleThreadSetJoiner around the<br /> corresponding<br /> run_task() call and also the tasks themselves need to join<br /> the<br /> STS and<br /> handle requests for safepoints not by yielding, but by<br /> leaving<br /> the<br /> task.<br /> This is far too peculiar for me to make the call to hook<br /> up GC<br /> workers<br /> for safepoint cleanup, and I thus removed those parts. I<br /> left the<br /> API in<br /> CollectedHeap in place. I think GC devs who know better<br /> about G1<br /> and CMS<br /> should make that call, or else just use a separate thread<br /> pool.<br /><br /> <a href="http://cr.openjdk.java.net/~rkennke/8180932/webrev.05">http://cr.openjdk.java.net/~rkennke/8180932/webrev.05</a>/<br /> <<a href="http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.05">http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.05</a>/><br /><br /> Is it ok now?<br /></blockquote> I still think you should put the "Parallel Safepoint Cleanup"<br /> workers<br /> inside Shenandoah,<br /> so the SafepointSynchronizer only calls get_safepoint_workers,<br /> e.g.:<br /><br /> _cleanup_workers = heap->get_safepoint_workers();<br /> _num_cleanup_workers = _cleanup_workers != NULL ?<br /> _cleanup_workers->total_workers() : 1;<br /> ParallelSPCleanupTask cleanup(_cleanup_subtasks);<br /> StrongRootsScope srs(_num_cleanup_workers);<br /> if (_cleanup_workers != NULL) {<br /> _cleanup_workers->run_task(&cleanup,<br /> _num_cleanup_workers);<br /> } else {<br /> <a href="http://cleanup.work">cleanup.work</a>(0);<br /> }<br /><br /> That way you don't even need your new flags, but it will be<br /> up to<br /> the<br /> other GCs to make their worker available<br /> or cheat with a separate workgang.<br /></blockquote> I can do that, I don't mind. The question is, do we want that?<br /></blockquote> The problem is that we do not want to haste such decision, we<br /> believe<br /> there is a better solution.<br /> I think you also would want another solution.<br /> But it's seems like such solution with 1 'global' thread pool<br /> either<br /> own by GC or the VM it self is quite the undertaking.<br /> Since this probably will not be done any time soon my<br /> suggestion is,<br /> to not hold you back (we also want this), just to make<br /> the code parallel and as an intermediate step ask the GC if it<br /> minds<br /> sharing it's thread.<br /><br /> Now when Shenandoah is merged it's possible that e.g. G1 will<br /> share<br /> the code for a separate thread pool, do something of it's own or<br /> wait until the bigger question about thread pool(s) have been<br /> resolved.<br /><br /> By adding a thread pool directly to the SafepointSynchronizer<br /> and<br /> flags for it we might limit our future options.<br /><br /><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #ccc; padding-left: 1ex;"> I wouldn't call it 'cheating with a separate workgang'<br /> though. I<br /> see<br /> that both G1 and CMS suspend their worker threads at a<br /> safepoint.<br /> However:<br /></blockquote> Yes it's not cheating but I want decent heuristics between e.g.<br /> number<br /> of concurrent marking threads and parallel safepoint threads<br /> since<br /> they compete for cpu time.<br /> As the code looks now, I think that decisions must be made by<br /> the<br /> GC.<br /></blockquote> Ok, I see your point. I updated the proposed patch accordingly:<br /><br /> <a href="http://cr.openjdk.java.net/~rkennke/8180932/webrev.06">http://cr.openjdk.java.net/~rkennke/8180932/webrev.06</a>/<br /> <<a href="http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.06">http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.06</a>/><br /></blockquote> Oops. Minor mistake there. Correction:<br /> <a href="http://cr.openjdk.java.net/~rkennke/8180932/webrev.07">http://cr.openjdk.java.net/~rkennke/8180932/webrev.07</a>/<br /> <<a href="http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.07">http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.07</a>/><br /><br /> (Removed 'class WorkGang' from safepoint.hpp, and forgot to add it<br /> into<br /> collectedHeap.hpp, resulting in build failure...)<br /><br /> Roman</blockquote><br /></blockquote><br /></blockquote></blockquote><br /></blockquote></blockquote><br /></blockquote></blockquote> <br /></blockquote></pre></blockquote></div><br>
-- <br>
Diese Nachricht wurde von meinem Android-Gerät mit K-9 Mail gesendet.</body></html>