RFR: bug: Timely Reducing Unused Committed Memory

Thu Sep 6 18:46:06 UTC 2018

Hi,

  some additional comment:

The specific issue with having a new additional thread is its (stack)
memory usage which will show up in memory usage benchmarks.

We could get away without an additional thread if we changed the
G1ConcurrentMarkThread to do some kind of (regular) polling instead of
signalling.

I.e. the G1ConcurrentMarkThread would poll a few status variables every
now and then, with sleeps inbetween.

Then it would be possible to start VM operations/GCs within that loop
easily as both G1ConcurrentMarkThread and VM thread (which runs the VM
operations) try to lock the same lock.

This would also allow easy disable/enable of the idle-uncommit feature
at runtime.

Btw, if I am not mistaken the current implementation of the idle
triggering might be a bit surprising: since the _last_heap_resized
counter is not reset for every gc, there may be unforeseen situations
where idle is detected while (infrequent) GCs are running.

Is the current implementation in this way what you expected?

Thanks,
  Thomas 

On Thu, 2018-09-06 at 17:51 +0200, Thomas Schatzl wrote:
> Hi again,
> 
> On Wed, 2018-09-05 at 23:52 +0000, Rodrigo Bruno wrote:
> > Hi Thomas,
> > 
> > thank you for your fast reply!
> > 
> > Thomas Schatzl <thomas.schatzl at oracle.com> escreveu no dia quarta,
> > 5/09/2018 à(s) 20:13:
> > > Hi Rodrigo,
> > > 
> > > On Tue, 2018-09-04 at 15:04 +0000, Rodrigo Bruno wrote:
> > > > Hi Thomas,
> > > > 
> > > 
> > > [...]
> > > 
> > > > 
> > > > I didn't devise any experiments to measure memory usage.
> > > > However,
> > > > we can obviously do that.
> 
> That's not required.
> 
> > > > 
> > > > The new version of the patch regarding 
> > > > 
> > > > http://openjdk.java.net/jeps/8204089
> > > > 
> > > > can be found at
> > > > 
> > > > http://www.gsd.inesc-id.pt/~rbruno/webrev.zip
> > > > 
> > > > This version already includes idle compaction (Full GC is 
> > > > optional) and your previous comments as well.
> > > > 
> > > > Let me know what you think. 
> > > 
> > > - small tidbit: the change did not update the type of the
> > > _last_heap_resize variable; the webrev does not compile as is
> > > because of the recent introduction to use -Wreorder.
> > > 
> > 
> > Oh... I will fix that and send a new version very soon.
> 
> Don't worry. When trying this myselves, I found that we are lacking
> some helper methods to the Ticks/Tickspan classes that I intend to
> introduce in a bit... so I kept it.
> 
> > > - the test crashes here with some assert during resizing (after
> > > concurrent marking) in our regression testing - it does not give
> > > any new errors though, so that is good at least. 
> > > I will definitely need to look into this - this may be an
> > > existing
> > > issue with (in this case heap shrinking) :)
> > > 
> > 
> > Which test? I was running the new test in Linux x86_64 slowdebug
> > build and not hitting any assert. How did you run it?
> > 
> > Still regardind tests, it there a way to run the same test with
> > multiple (different) command line arguments?
> 
> The problem is some assert in the resizing code that checks whether
> some cached value of amount of bytes used in the heap is equal to the
> actual bytes used.
> 
> This does not match if there is some region we are currently
> allocating
> into, as we are only updating that cached value when that region is
> full for various regions.
> 
> The resizing (resize_if_necessary_after_full_collection()) had the
> implict assumption that there is no such regions, so that cached
> value
> did not contain it.
> 
> It triggers e.g. with your test when running with -Xcomp.
> 
> >  
> > > - it would have been a lot nicer if the sleep_before_next_cycle()
> > > returned some integer/enum which can be acted upon outside of it.
> > > I.e. it is imho much more readable to separate control/decision
> > > making from acting on it compared to separating both in this
> > > case.
> > > The current sleep_before_next_cycle() is kind of hacky :) 
> > > I will look into this as well and propose an alternative. This
> > > may
> > > take a day or two.
> > 
> > Okay, we can definitely improve the current version of this method.
> 
> So there is a new version of the change at http://cr.openjdk.java.net
> /~
> tschatzl/jelastic/pgc-webrev.2/ (full) and http://cr.openjdk.java.net
> /~
> tschatzl/jelastic/pgc-webrev.1_to_2 (diff) that fixes a few bugs, and
> also makes that method nicer, actually reverting it to the former.
> 
> Let me explain what I changed:
> 
> - there is a race between execution of the VM operation to start the
> concurrent phase (calling the VM_G1CollectForAllocation) and the
> actual
> work done.
> I.e. actual work outside of
> G1ConcurrentMarkThread::sleep_before_next_cycle() could start before
> the actual VM operation started.
> 
> Finally, it is possible (but really unlikely) that the work outside
> completes before returning to
> G1ConcurrentMarkThread::sleep_before_next_cycle(), where the CGC_lock
> is grabbed again. However the VM operation itself also needs to
> CGC_lock to notify G1ConcurrentMarkThread, deadlocking (at least some
> code I had had that problem :))
> 
> So the only way out here unfortunately is to have a separate thread
> doing idle detection and scheduling of the necessary VM operation.
> That
> thread is only started if idle detection is active.
> 
> I think the code was sort-of aware of this issue by break'ing out of
> the while loop and that ominous "TODO" comment :) 
> 
> - the mentioned failing assert.
> 
> - I think something bad would happen if the idle timeout triggered
> while an existing concurrent marking would be under way (which could
> be
> if you set the "load" threshold high enough). I did not actually try
> so
> I may be wrong, but nevertheless.
> 
> - instead of the additional VM operation you can use
> Universe::collect() and then using the UseFullGCForIdleCompaction
> flag
> to schedule the correct VM operation (either full or concurrent
> start),
> and simply always resize the heap in the concurrent cycle (in this
> case
> in the Remark pause). There is a RFE for that out in the bug tracker,
> and we've been asked about this just recently. I will split it out
> from
> this webrev later.
> 
> That removes a lot of code.
> 
> - I added a second test case that tests UseFullGCForIdleCompaction to
> the junit test.
> 
> The webrev I've uploaded is probably not complete, in fact I only
> just
> submitted it to our test system.
> 
> At least the following work needs to be done:
> 
> - move G1DetectIdleThread out of the g1COncurrentMarkThread files
> into
> separate ones.
> 
> - naming of the flags: I think at least GCFrequency is too generic,
> and
> should be in ms too.
> 
> - debug messages for the idle detection to see in detail which
> condition triggered and why.
> 
> - the current implementation for the lazy idle thread initialization
> is
> probably wrong at the moment: GCFrequency that is used is manageable,
> i.e. the current implementation does not consider that. If you turn
> it
> on during runtime after it has initially been off, it does not start.
> Btw, this should be tested too.
> 
> Probably there are some bugs still left, but please have a look.
> 
> Would you mind looking into the issues above?
> 
> Thanks,
>   Thomas
>