RFR: 8158871: Long response times with G1 and StringDeduplication

Tue Jun 21 08:07:24 UTC 2016

On Thu, 2016-06-16 at 09:16 +0200, Per Liden wrote:
> Thanks StefanK/StefanJ/Dima/Thomas for reviewing.
> 
> Updated webrev: http://cr.openjdk.java.net/~pliden/8158871/webrev.2/
> Diff: http://cr.openjdk.java.net/~pliden/8158871/webrev.2.diff/
> 
> thanks,
> Per
> 
> On 2016-06-15 09:41, Per Liden wrote:
> > 
> > Hi Thomas,
> > 
> > On 2016-06-14 16:02, Thomas Schatzl wrote:
> > > 
> > > Hi Per,
> > > 
> > >    some quick initial comments:
> > > 
> > > On Tue, 2016-06-14 at 14:18 +0200, Per Liden wrote:
> > > > 
> > > > Hi,
> > > > 
> > > > Received some comments off-line and here's an updated webrev:
> > > > 
> > > > http://cr.openjdk.java.net/~pliden/8158871/webrev.1/
> > > > 
> > > > Main difference is that the overflow list now lives next to the
> > > > cache
> > > > list, rather than inside the cache list.
> > >    - did you ever measure how long the deletion of all entries
> > > could
> > Yes, for many millions of entries it can take many seconds. It's
> > especially bad on Solaris (where this bug was uncovered).
> > 
> > > 
> > > take? It does not try to synchronize for safepoints, so
> > > potentially
> > > could delay safepoint for a long time.
> > > (in G1StringDedupEntryCache::delete_overflowed)
> > I think you might have the STS backwards. You're only blocking (or
> > synchronizing with) safepoints when joined to the STS. The purpose
> > of this fix is to change how deletion works so that the dedup
> > thread isn't joined to the STS and hence doesn't block safepoints.

Okay.

> > >    - I would kind of prefer
> > > if G1StringDedupEntryCache::G1StringDedupEntryCache() would call
> > > set_max_size() to set _max_list_lengths instead of duplicating
> > > the code
> > > in the initialization list.
> > Will fix.

Thanks.

> > 
> > > 
> > > 
> > >    - any concurrent kind of work should have log message at the
> > > beginning and the end of that work, otherwise it may be hard to
> > > find
> > > out what the process is doing from the log (i.e. delaying the
> > > safepoint) at the moment.
> > I agree, strdedup should be more aligned with how logging is done
> > with UL these days. I'd prefer to file a separate bug for that as
> > it's kind of unrelated to this bug (this bug .
> > 
> > (and as mentioned above, this doesn't delay safepoints).

Okay. Did you already file a CR for that? Otherwise I can do that.

> > >    - I am not sure if the default value of max_cache_size are
> > > good values. I mean, if you have lots of threads (we are running
> > > on machines with 2k+ threads already), the cache size will be
> > > zero by default. 
> > I'm thinking that's ok. The max cache size will be small (or zero)
> > only if you have a tiny dedup table. A tiny dedup table indicates
> > that you aren't doing much inserts into it (if you did it would
> > automatically grow and max_cache_size would grow with it), so a
> > cache will not help you much anyway in that case.
> > 
> > It's probably also worth mentioning that the main purpose of the
> > cache is to allow for the delayed/deferred freeing of entries.

Not completely convinced, as this kind of assumes that freeing scales
pretty linearly with the number of threads.

However, due to malloc/free from experience not scaling that well to
that number of threads, this would mean these large machines will take
more time than comparable smaller machines during the pause for the
remaining entries.

Since we do log the time taken, we can at least pin down this as an
issue in the future. So, okay for now.

Thanks,
  Thomas