RFR: 8158871: Long response times with G1 and StringDeduplication

Thu Jun 16 07:16:52 UTC 2016

Thanks StefanK/StefanJ/Dima/Thomas for reviewing.

Updated webrev: http://cr.openjdk.java.net/~pliden/8158871/webrev.2/
Diff: http://cr.openjdk.java.net/~pliden/8158871/webrev.2.diff/

thanks,
Per

On 2016-06-15 09:41, Per Liden wrote:
> Hi Thomas,
>
> On 2016-06-14 16:02, Thomas Schatzl wrote:
>> Hi Per,
>>
>>    some quick initial comments:
>>
>> On Tue, 2016-06-14 at 14:18 +0200, Per Liden wrote:
>>> Hi,
>>>
>>> Received some comments off-line and here's an updated webrev:
>>>
>>> http://cr.openjdk.java.net/~pliden/8158871/webrev.1/
>>>
>>> Main difference is that the overflow list now lives next to the cache
>>> list, rather than inside the cache list.
>>
>>    - did you ever measure how long the deletion of all entries could
>
> Yes, for many millions of entries it can take many seconds. It's
> especially bad on Solaris (where this bug was uncovered).
>
>> take? It does not try to synchronize for safepoints, so potentially
>> could delay safepoint for a long time.
>> (in G1StringDedupEntryCache::delete_overflowed)
>
> I think you might have the STS backwards. You're only blocking (or
> synchronizing with) safepoints when joined to the STS. The purpose of
> this fix is to change how deletion works so that the dedup thread isn't
> joined to the STS and hence doesn't block safepoints.
>
>>
>>    - I would kind of prefer
>> if G1StringDedupEntryCache::G1StringDedupEntryCache() would call
>> set_max_size() to set _max_list_lengths instead of duplicating the code
>> in the initialization list.
>
> Will fix.
>
>>
>>    - any concurrent kind of work should have log message at the
>> beginning and the end of that work, otherwise it may be hard to find
>> out what the process is doing from the log (i.e. delaying the
>> safepoint) at the moment.
>
> I agree, strdedup should be more aligned with how logging is done with
> UL these days. I'd prefer to file a separate bug for that as it's kind
> of unrelated to this bug (this bug .
>
> (and as mentioned above, this doesn't delay safepoints).
>
>>
>>    - I am not sure if the default value of max_cache_size are good
>> values. I mean, if you have lots of threads (we are running on machines
>> with 2k+ threads already), the cache size will be zero by default.
>
> I'm thinking that's ok. The max cache size will be small (or zero) only
> if you have a tiny dedup table. A tiny dedup table indicates that you
> aren't doing much inserts into it (if you did it would automatically
> grow and max_cache_size would grow with it), so a cache will not help
> you much anyway in that case.
>
> It's probably also worth mentioning that the main purpose of the cache
> is to allow for the delayed/deferred freeing of entries.
>
> thanks,
> Per
>
>>
>> Thanks,
>>    Thomas
>>