RFR: 8158871: Long response times with G1 and StringDeduplication

Per Liden per.liden at oracle.com
Tue Jun 21 13:21:13 UTC 2016


Hi Thomas,

On 2016-06-21 10:07, Thomas Schatzl wrote:
> On Thu, 2016-06-16 at 09:16 +0200, Per Liden wrote:
>> Thanks StefanK/StefanJ/Dima/Thomas for reviewing.
>>
>> Updated webrev: http://cr.openjdk.java.net/~pliden/8158871/webrev.2/
>> Diff: http://cr.openjdk.java.net/~pliden/8158871/webrev.2.diff/
>>
>> thanks,
>> Per
>>
>> On 2016-06-15 09:41, Per Liden wrote:
>>>
>>> Hi Thomas,
>>>
>>> On 2016-06-14 16:02, Thomas Schatzl wrote:
>>>>
>>>> Hi Per,
>>>>
>>>>     some quick initial comments:
>>>>
>>>> On Tue, 2016-06-14 at 14:18 +0200, Per Liden wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Received some comments off-line and here's an updated webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~pliden/8158871/webrev.1/
>>>>>
>>>>> Main difference is that the overflow list now lives next to the
>>>>> cache
>>>>> list, rather than inside the cache list.
>>>>     - did you ever measure how long the deletion of all entries
>>>> could
>>> Yes, for many millions of entries it can take many seconds. It's
>>> especially bad on Solaris (where this bug was uncovered).
>>>
>>>>
>>>> take? It does not try to synchronize for safepoints, so
>>>> potentially
>>>> could delay safepoint for a long time.
>>>> (in G1StringDedupEntryCache::delete_overflowed)
>>> I think you might have the STS backwards. You're only blocking (or
>>> synchronizing with) safepoints when joined to the STS. The purpose
>>> of this fix is to change how deletion works so that the dedup
>>> thread isn't joined to the STS and hence doesn't block safepoints.
>
> Okay.
>
>>>>     - I would kind of prefer
>>>> if G1StringDedupEntryCache::G1StringDedupEntryCache() would call
>>>> set_max_size() to set _max_list_lengths instead of duplicating
>>>> the code
>>>> in the initialization list.
>>> Will fix.
>
> Thanks.
>
>>>
>>>>
>>>>
>>>>     - any concurrent kind of work should have log message at the
>>>> beginning and the end of that work, otherwise it may be hard to
>>>> find
>>>> out what the process is doing from the log (i.e. delaying the
>>>> safepoint) at the moment.
>>> I agree, strdedup should be more aligned with how logging is done
>>> with UL these days. I'd prefer to file a separate bug for that as
>>> it's kind of unrelated to this bug (this bug .
>>>
>>> (and as mentioned above, this doesn't delay safepoints).
>
> Okay. Did you already file a CR for that? Otherwise I can do that.

Just created it.

https://bugs.openjdk.java.net/browse/JDK-8159974

>
>>>>     - I am not sure if the default value of max_cache_size are
>>>> good values. I mean, if you have lots of threads (we are running
>>>> on machines with 2k+ threads already), the cache size will be
>>>> zero by default.
>>> I'm thinking that's ok. The max cache size will be small (or zero)
>>> only if you have a tiny dedup table. A tiny dedup table indicates
>>> that you aren't doing much inserts into it (if you did it would
>>> automatically grow and max_cache_size would grow with it), so a
>>> cache will not help you much anyway in that case.
>>>
>>> It's probably also worth mentioning that the main purpose of the
>>> cache is to allow for the delayed/deferred freeing of entries.
>
> Not completely convinced, as this kind of assumes that freeing scales
> pretty linearly with the number of threads.
>
> However, due to malloc/free from experience not scaling that well to
> that number of threads, this would mean these large machines will take
> more time than comparable smaller machines during the pause for the
> remaining entries.
>
> Since we do log the time taken, we can at least pin down this as an
> issue in the future. So, okay for now.

Sounds good, thanks.

cheers,
Per

>
> Thanks,
>    Thomas
>



More information about the hotspot-gc-dev mailing list