String Deduplication in JEP192
Per Liden
per.liden at oracle.com
Mon Mar 3 09:50:01 UTC 2014
Hi,
I think Charlie answered most or all questions already, just an
additional comment below.
On 2014-03-01 14:59, charlie hunt wrote:
> On 03/01/2014 07:45 AM, Kirk Pepperdine wrote:
>> On Mar 1, 2014, at 2:42 PM, charlie hunt <charlie.hunt at oracle.com>
>> wrote:
>>
>>> Hi Kirk,
>>>
>>> Thanks for the good questions. :-)
>>>
>>> At the risk of jumping in ahead of the JEP authors and reviewers ...
>>>
>>>> where did you get the statistics from?
>>> 900+ profiles
>> Customer apps I presume?
> Yep
>>> If you or others have a repository of profiles, please share your
>>> observations (if you can).
>> Sorry but I generally don’t keep customers' code. ;-)
> Understood
>>>> If the weak generational hypothesis holds will you not be spending
>>>> more time deduping soon to be garbage which potentially would place
>>>> more load on the GC threads?
>>> As mentioned in the JEP, you can manipulate the age at which a
>>> String becomes a candidate with -XX:StringDeduplicationAgeThreshold.
>> Missed that one.. I guess an obvious threshold would be a promotion
>> to tenured????
> A reasonable place to start. ;-) As you know one's goals, i.e.
> throughput, latency or footprint will drive you to the best setting
> assuming a representative workload and monitoring production behavior.
It would have been nice to have the deduplication age threshold
automatically follow the tenuring threshold. However, there are some
technical details here which make this problematic. To avoid inspecting
the same String more than once we want to be able to quickly filter out
Strings which have already been inspected. We want to do this cheaply as
it's in the hot path (i.e. we don't want to look it up in the dedup
table). With a fixed deduplication threshold we can do this by simply
looking at the String's age and the type of region it's in. The tenuring
threshold is dynamic and recalculated for each GC. If it were also used
as the deduplication threshold, we wouldn't be able to cheaply tell
whether a String has already been inspected. I've prototyped different
approaches where the tenuring threshold was used as the dedup threshold,
but in the end they all turned out to be less attractive options.
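
To make the filtering idea concrete, here is a minimal sketch in Java. It
is a simplified model, not the HotSpot code: the class, the enum, the
method names and the threshold value of 3 are all assumptions made for
illustration. It assumes a String's age goes up by one each time it is
copied within the young generation and stays unchanged when it is
promoted to an old region.

```java
// Simplified model of a cheap "already inspected?" filter.
// All names here are hypothetical; this is not the HotSpot implementation.
final class DedupCandidateSketch {
    // Assumed fixed deduplication age threshold (illustrative value).
    static final int DEDUP_AGE_THRESHOLD = 3;

    enum RegionType { YOUNG, OLD }

    // age: the age recorded on the String after this GC copied it.
    // destination: the type of region the String was copied into.
    static boolean isCandidate(int age, RegionType destination) {
        if (destination == RegionType.YOUNG) {
            // Still young: inspect exactly when the age hits the threshold.
            return age == DEDUP_AGE_THRESHOLD;
        }
        // Promoted: inspect only if it never reached the threshold while
        // young, since otherwise it was already inspected there.
        return age < DEDUP_AGE_THRESHOLD;
    }

    // Counts how often one String would be inspected over a number of GCs
    // if it is promoted once its age reaches promoteAtAge.
    static int inspections(int promoteAtAge, int gcs) {
        int age = 0;
        int count = 0;
        RegionType region = RegionType.YOUNG;
        for (int gc = 0; gc < gcs && region == RegionType.YOUNG; gc++) {
            if (age >= promoteAtAge) {
                region = RegionType.OLD;   // promoted, age unchanged
            } else {
                age++;                     // copied within young, age bumped
            }
            if (isCandidate(age, region)) {
                count++;
            }
        }
        return count;
    }
}
```

In this model, calling DedupCandidateSketch.inspections(p, n) for different
promotion ages p never yields more than one inspection per String, because
the equality test against a fixed threshold can only match once. If the
threshold changed between GCs, that test could match twice or not at all,
which is the problem described above.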
/Per
>>
>> I guess that only partially addresses the cost issues.
>
> Agreed ... no replacement for measurements (on representative data). :-)
>
> charlie
>>
>> — Kirk
>>
>