RFR(L): 8029075 - String deduplication in G1

Christian Thalinger christian.thalinger at oracle.com
Thu Mar 6 13:04:45 PST 2014


On Mar 5, 2014, at 1:56 AM, Per Liden <per.liden at oracle.com> wrote:

> Yep, a more efficient mechanism for reference processing without the current overhead per reference would open up some interesting possibilities. I have some loose/untested ideas on how we could address some of the efficiency problems, but the memory overhead is trickier to solve.

I talked to John Rose about this and he has some ideas.  One way to solve this is to use Value Types, but that is still far out.  Another way would be to use some kind of annotation to mark references and teach the GCs about that.

> 
> /Per
> 
> On 03/04/2014 05:55 PM, Christian Thalinger wrote:
>> That’s unfortunate.  It clearly would have been easier to do all that work in Java.
>> 
>> On Mar 4, 2014, at 1:34 AM, Per Liden <per.liden at oracle.com> wrote:
>> 
>>> Hi Christian,
>>> 
>>> That was something Mikael Vidstedt and I briefly touched on when we first started to think about deduplication. However, we didn't go down that path, partly because of the problem you mention, but also because of the general overhead of WeakReferences (in terms of memory and work added to the reference processor and Reference Handler).
>>> 
>>> /Per
>>> 
>>> On 03/04/2014 02:48 AM, Christian Thalinger wrote:
>>>> Just a general question:  Have you thought about writing the deduplication thread in Java?  Maybe also the deduplication table?
>>>> 
>>>> There is clearly the problem of enqueuing.  One easy solution would be to have the deduplication thread poll into the VM to get the next entries, but there might be a better way to do it.
>>>> 
>>>> On Mar 3, 2014, at 5:33 AM, Per Liden <per.liden at oracle.com> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> Could I please have this patch reviewed?
>>>>> 
>>>>> 
>>>>> Summary
>>>>> -------
>>>>> This patch implements JEP 192 - String deduplication in G1. The goal of string deduplication is to reduce the Java heap live-data set by enhancing the G1 garbage collector so that duplicate instances of String are automatically and continuously deduplicated.
>>>>> 
>>>>> I'd like to refer to the JEP for a more detailed description of this feature, the motivation for it, the expected benefit, how it's implemented, etc.
>>>>> 
>>>>> This patch is mainly G1-related, but also touches a few runtime files.
>>>>> 
>>>>> Webrev: http://cr.openjdk.java.net/~pliden/8029075/webrev.0/
>>>>> 
>>>>> JEP: http://openjdk.java.net/jeps/192
>>>>> 
>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8029075
>>>>> 
>>>>> 
>>>>> Testing
>>>>> -------
>>>>> * JTreg - The patch includes the following 8 new tests:
>>>>>  - TestStringDeduplicationYoungGC.java: Tests deduplication during Young GC.
>>>>>  - TestStringDeduplicationFullGC.java: Tests deduplication during Full GC.
>>>>>  - TestStringDeduplicationAgeThreshold.java: Tests both valid and invalid age threshold settings.
>>>>>  - TestStringDeduplicationInterned.java: Tests that interned strings are deduplicated before being interned.
>>>>>  - TestStringDeduplicationTableRehash.java: Stresses the hashtable's ability to rehash all entries.
>>>>>  - TestStringDeduplicationTableResize.java: Stresses the hashtable's ability to resize itself.
>>>>>  - TestStringDeduplicationMemoryUsage.java: Tests heap reduction when string deduplication is enabled.
>>>>>  - TestStringDeduplicationPrintOptions.java: Tests command line options.
>>>>> 
>>>>> * Stress testing:
>>>>>  - Kitchensink
>>>>>  - GCBasher
>>>>> 
>>>>> * Regression testing:
>>>>>  - JCK
>>>>>  - vmTestbase
>>>>>  - Bigapps
>>>>> 
>>>>> * Large scale benchmarks to test heap reduction and performance impact:
>>>>>  - FA CRM Sales Op. Flow
>>>>>  - FA DSS
>>>>> 
>>>>> * The following benchmarks have been executed to verify that this feature doesn't impact performance when disabled (even when disabled, there are still some "if (UseStringDeduplication)" checks executed in some hot paths).
>>>>>  - SPECjbb2005
>>>>>  - SPECjbb2013
>>>>>  - SPECjvm2008-XML
>>>>> 
>>>>> * Various ad-hoc tests and microbenchmarks were also written and executed during the course of the development.
>>>>> 
>>>>> 
>>>>> Thanks!
>>>>> /Per
>>>> 
>>> 
>> 
> 
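
For readers following the thread, here is a minimal sketch of what a Java-level deduplication table built on WeakReferences might look like, to make the overhead Per mentions concrete. It is not taken from the patch or the JEP: the class name DedupTableSketch and the method dedup are hypothetical. It also canonicalizes whole String instances, whereas the G1 feature works inside the VM, which plain Java code cannot do. Note that every table entry costs a WeakHashMap entry plus a WeakReference object, all of which the reference processor has to handle.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical illustration only; not part of the reviewed patch.
final class DedupTableSketch {
    // Keys are held weakly by WeakHashMap. The value must also be weak,
    // otherwise it would keep its own key strongly reachable forever.
    private final Map<String, WeakReference<String>> table = new WeakHashMap<>();

    // Returns the canonical instance equal to s, registering s if none exists.
    synchronized String dedup(String s) {
        WeakReference<String> ref = table.get(s);
        String canonical = (ref != null) ? ref.get() : null;
        if (canonical == null) {
            // No live canonical copy: s becomes the canonical instance.
            table.put(s, new WeakReference<>(s));
            canonical = s;
        }
        return canonical;
    }

    public static void main(String[] args) {
        DedupTableSketch t = new DedupTableSketch();
        String a = new String("hello");
        String b = new String("hello");
        System.out.println(a == b);                   // false
        System.out.println(t.dedup(a) == t.dedup(b)); // true
    }
}
```

In this sketch callers must explicitly replace their reference with the value returned by dedup, which is one more way it differs from the automatic, continuous deduplication described in the summary above.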


