Some questions about String interning for primitives

Roger Riggs roger.riggs at oracle.com
Thu Jun 3 13:50:38 UTC 2021


Hi Dave,

Have you seen the related thread about Integer.toString()?

https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-April/076454.html

Allocation is cheap, and short-lived object storage is quick to be reclaimed.

What's missing is data about the cost vs. savings in real applications.
That question extends all the way down to memory timing differences
between recently accessed and not recently accessed memory, multi-level
memory caches, etc.

Regards, Roger


On 6/3/21 3:47 AM, dfranken.jdk at gmail.com wrote:
> Dear readers,
>
> Apologies in advance if these questions have been asked and discussed
> before or if they are on the wrong mailing list. Maybe they are more
> suited for Project Valhalla; I'm not sure.
>
> I was wondering if it would be possible / feasible to intern primitive
> values and byte arrays and use these interned values instead of
> creating a new object for each conversion.
>
> Currently, the following code prints 'false' as s1 and s2 are
> references to different objects:
>
>    String s1 = Integer.toString(1);
>    s1 = s1.intern(); // Makes no difference whether intern is called
>    String s2 = Integer.toString(1);
>    System.out.println(s1 == s2);
>
> I know that there is an integer cache for boxing / unboxing commonly
> used integers (for numbers ranging from -128 to 127). How about a
> String cache for commonly converted primitives? We could use the same
> ranges initially.
>
> I.e. when I use Integer.toString(1) I don't really care if I get a
> newly allocated String, I only care that I get back a String which
> equals "1" and it's okay if that is a reference to an interned value.
> Likewise with the mirrored variants such as String.valueOf(..).
>
> I was also wondering how byte conversions could work with such a cache.
> Currently, the only way for me to go from bytes to a String is with
> new String(bytes, charset) which guarantees creation of a new String
> object each time. Well, what if the bytes often contain the same value?
>
> Would it be useful to be able to do:
>    
>    String s1 = String.valueOf(bytes, UTF_8);
>    String s2 = String.valueOf(bytes, UTF_8); // <-- returns the same
>                                              //     reference as s1
>
> Kind regards,
>
> Dave Franken
>    
>
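
For illustration, the String cache proposed in the quoted message could be sketched roughly as below. This is only a sketch under assumptions, not an existing JDK API: the class name SmallIntStrings and its toString method are hypothetical, and the range is simply borrowed from the Integer boxing cache mentioned above. Whether such a cache saves anything in practice is exactly the open cost vs. savings question.

```java
public class SmallIntStrings {
    // Assumption: reuse the -128..127 range of the Integer boxing cache.
    private static final int LOW = -128;
    private static final int HIGH = 127;

    // One preallocated String per value in the range, filled once at class load.
    private static final String[] CACHE = new String[HIGH - LOW + 1];

    static {
        for (int i = LOW; i <= HIGH; i++) {
            CACHE[i - LOW] = Integer.toString(i);
        }
    }

    /** Returns a shared String for values in [LOW, HIGH], a fresh one otherwise. */
    public static String toString(int value) {
        if (value >= LOW && value <= HIGH) {
            return CACHE[value - LOW];
        }
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        String a = Integer.toString(1);
        String b = Integer.toString(1);

        // Content equality holds regardless of how the Strings were created.
        System.out.println(a.equals(b));              // true

        // intern() on both sides yields the one canonical pooled instance.
        System.out.println(a.intern() == b.intern()); // true

        // The hypothetical cache hands out the same reference on every call.
        System.out.println(toString(1) == toString(1)); // true

        // Values outside the range fall back to a normal conversion.
        System.out.println(toString(1000).equals("1000")); // true
    }
}
```

Callers that compare with equals() would see no behavioural difference; only code that relies on reference identity could observe the sharing.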


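The byte-array variant from the quoted message can be approximated in user code today. The sketch below is an assumption-laden illustration, not the proposed String.valueOf(bytes, charset) (which does not exist): DecodedStringCache is a hypothetical helper that keys a map on the byte content. Note that every lookup hashes the whole array and that the map is unbounded, so the savings are far from guaranteed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DecodedStringCache {
    private final Charset charset;

    // ByteBuffer.equals/hashCode compare the remaining bytes, so buffers
    // wrapping equal content act as equal keys.
    private final Map<ByteBuffer, String> cache = new ConcurrentHashMap<>();

    public DecodedStringCache(Charset charset) {
        this.charset = charset;
    }

    /** Returns the same String instance for byte arrays with equal content. */
    public String valueOf(byte[] bytes) {
        // Look up without copying; the wrapper is discarded afterwards.
        String cached = cache.get(ByteBuffer.wrap(bytes));
        if (cached != null) {
            return cached;
        }
        // Copy before storing so later changes to the caller's array
        // cannot alter a key that is already in the map.
        ByteBuffer key = ByteBuffer.wrap(bytes.clone());
        String decoded = new String(key.array(), charset);
        String previous = cache.putIfAbsent(key, decoded);
        return previous != null ? previous : decoded;
    }

    public static void main(String[] args) {
        DecodedStringCache cache = new DecodedStringCache(StandardCharsets.UTF_8);
        byte[] first = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] second = "hello".getBytes(StandardCharsets.UTF_8);

        String s1 = cache.valueOf(first);
        String s2 = cache.valueOf(second);
        System.out.println(s1 == s2); // true: second call is served from the cache
    }
}
```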
