String Deduplication in JEP192

Fri Feb 28 23:08:23 UTC 2014

Hello,

I am not sure what the proper process is, but I noticed that Dalibor
retweeted a link to JEP 192 - String Deduplication in G1.

http://openjdk.java.net/jeps/192

The most obvious thing I noticed is that the JEP goes into detail to
describe how a String object is made up of a hash code and a char
array, but it completely omits the offset and count fields (used for
substrings).

One could say that this level of detail is outside the scope of the
JEP, but then the other details of the String object should be removed
as well.
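
To illustrate why the offset and count fields matter here, a minimal
sketch follows. It assumes the older String layout in which a substring
shares its parent's char array; ModelString and canSwapArrays are
hypothetical names made up for this mail, not JDK classes or anything
taken from the JEP.

```java
import java.util.Arrays;

// Hypothetical model of a String with offset and count fields.
// This is NOT java.lang.String, only an illustration of the layout.
final class ModelString {
    final char[] value;  // backing array, possibly shared with other strings
    final int offset;    // index of the first used char in value
    final int count;     // number of used chars
    int hash;            // cached hash code, 0 while not yet computed

    ModelString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // A substring shares the backing array and only adjusts offset/count.
    ModelString substring(int begin, int end) {
        return new ModelString(value, offset + begin, end - begin);
    }

    // Assumption of this sketch: a deduplicator that only redirects the
    // 'value' reference may do so only if the complete arrays are equal,
    // because offset and count stay untouched in each string.
    static boolean canSwapArrays(ModelString a, ModelString b) {
        return Arrays.equals(a.value, b.value);
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}
```

With this model, "hello" taken as a substring of "hello world" and a
freshly built "hello" have equal content, but their backing arrays
differ, so simply redirecting the array reference would not be safe.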

What I also miss is a detailed description of how this is integrated
with the GC. There are some interactions around aging, dereferencing
and atomic replacement, but most of the JEP deals with functionality
outside the GC.

It looks a bit like it will suffer from scalability problems similar
to those of the existing string pool. Maybe it would be better to
redesign the string pool so that it solves both problems with less
work for the GC phases. This could go so far as to even have a (new)
string intern API which could be used by things like XML parsers or
network decoders - which are typically a source of lots of string
duplication in apps.
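
To make the idea of an explicit API more concrete, here is a minimal
sketch of what a parser could call. StringDeduplicator and dedup are
hypothetical names for illustration only; this is not an existing JDK
API and not what the JEP proposes.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical application-level deduplicator (not a JDK class).
final class StringDeduplicator {
    // Maps each distinct string content to its canonical instance.
    private final ConcurrentHashMap<String, String> canonical =
            new ConcurrentHashMap<>();

    // Returns the canonical instance for s; the first string seen with
    // a given content becomes the canonical one.
    String dedup(String s) {
        String existing = canonical.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    // Number of distinct strings currently registered.
    int size() {
        return canonical.size();
    }
}
```

An XML parser could pass every element or attribute name through
dedup() before handing it to the application. Note that this sketch
holds strong references and therefore never shrinks; a real design
would need weak references or some cooperation with the GC.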

(And I am not sure whether this should be so G1-specific; after all,
the adoption rate of G1 is still lower than it could be.)

Regards,
Bernd