String Deduplication in JEP192
Per Liden
per.liden at oracle.com
Mon Mar 3 09:30:19 UTC 2014
Hi,
On 03/01/2014 12:08 AM, Bernd Eckenfels wrote:
> Hello,
>
> not sure what the proper process is, but I notice that Dalibor
> retweeted a JEP link to JEP192 - String Deduplication in G1.
>
> http://openjdk.java.net/jeps/192
>
> The most obvious thing I noticed is that the JEP goes into detail to
> describe how a String object is constructed out of a hashcode and a char
> array, but it completely omits the offset+count fields (substrings).
As Thomas already mentioned, offset and count were removed from String
quite some time ago.
>
> One can say, it is not the scope of the JEP to be so detailed, but then
> the other details of the string object should be removed as well.
>
> What I also miss somewhat is a detailed description of how this is
> integrated with the GC. I mean there are some interactions around the
> topic of aging, dereferencing and atomic replacement, but most of the
> JEP deals with functionality outside the GC.
What's inside vs. outside of the GC is of course open for different
interpretations. The deduplication thread can conceptually be seen as
just another concurrent GC thread, which adds a new concurrent GC phase
to G1. The interactions with the existing GC phases are described in the
"Implementation Overview" and "Candidate Selection" sections.
>
> It looks a bit like it will suffer from similar scalability problems
> to the already existing string pool. Maybe it would be better to
Note that deduplication is done concurrently with the application, so
the app is not directly affected in that sense. The deduplication thread
is the main user of the deduplication table and, unlike the current
StringTable, the deduplication hashtable is dynamically resized at runtime.
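To make the idea of a deduplication table concrete, here is a minimal sketch in plain Java. It is not the HotSpot implementation: `DedupSketch`, `FakeString` and `Key` are hypothetical names, `FakeString` merely stands in for a String whose contents live in a char array, and `java.util.HashMap` is used only because it grows its bucket array as entries are added. The real table also has to cooperate with the collector, which this sketch ignores.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration only; none of these names exist in HotSpot.
final class DedupSketch {
    /** Stand-in for a String object: just a reference to its character array. */
    static final class FakeString {
        char[] value;
        FakeString(String s) { this.value = s.toCharArray(); }
    }

    /** Wraps a char[] so the map compares array contents rather than identity. */
    private static final class Key {
        final char[] chars;
        Key(char[] chars) { this.chars = chars; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && Arrays.equals(chars, ((Key) o).chars);
        }
        @Override public int hashCode() { return Arrays.hashCode(chars); }
    }

    // HashMap enlarges itself as it fills, loosely mirroring a table resized at run time.
    private final Map<Key, char[]> table = new HashMap<>();

    /** Points s.value at an identical array seen earlier, if any; returns true if it did. */
    boolean deduplicate(FakeString s) {
        char[] canonical = table.putIfAbsent(new Key(s.value), s.value);
        if (canonical == null || canonical == s.value) {
            return false; // first time these contents are seen, or already shared
        }
        s.value = canonical; // the duplicate array becomes garbage
        return true;
    }

    /** Number of distinct character arrays currently tracked. */
    int size() { return table.size(); }

    public static void main(String[] args) {
        DedupSketch dedup = new DedupSketch();
        FakeString a = new FakeString("hello");
        FakeString b = new FakeString("hello");
        dedup.deduplicate(a);
        dedup.deduplicate(b);
        System.out.println(a.value == b.value); // prints "true"
    }
}
```

In this model the two `FakeString` objects stay distinct; only the underlying array is shared, which is why the application does not need to take part.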
> re-design the string pool in a way it solves both problems with less
> work for the GC phases. This could go as far as having a (new)
> string intern API which could be used by things like XML parsers or
> network decoders - which are typically a source of lots of string
> duplications in apps.
I guess you're suggesting using (a better) String.intern(). The
alternative of using String.intern() is mentioned briefly under
"Alternatives".
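For comparison, this is roughly what the String.intern() route looks like from the application side. The class name `InternSketch` and the helper `sameObject` are made up for this example; only `String.intern()` itself is a real API.

```java
// Illustrative only: shows explicit interning as done by application code.
final class InternSketch {
    /** Returns true if both arguments are the same object, not merely equal. */
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal contents.
        String a = new String("duplicate");
        String b = new String("duplicate");

        System.out.println(sameObject(a, b));                   // prints "false"
        System.out.println(sameObject(a.intern(), b.intern())); // prints "true"
    }
}
```

The difference in approach is that intern() has to be called explicitly wherever duplicates are created (for example in a parser), whereas the deduplication in the JEP needs no changes to application code.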
cheers,
/Per
>
> (And I am not sure if this should be so G1 specific, after all the
> adoption rate of G1 is still lower than it could be)
>
> Regards
> Bernd
>
More information about the hotspot-gc-dev mailing list