Generic type signatures in MethodRepository have many copies of simple class names

Steven Schlansker stevenschlansker at gmail.com
Fri Aug 8 23:10:00 UTC 2025


Hello core-libs-dev, happy Friday!

While diagnosing an out of memory situation in our application, I noticed a surprising source of memory usage.
While this is not severe enough to actually cause our OOM, it seems wasteful and I thought to bring it to your attention.

We use reflection-based technologies like Jackson JSON and RESTEasy that retrieve generic type information from many of our classes.
Our profiler provides diagnostics of wasted space due to duplicate objects, particularly Strings.

The analysis highlights many thousands of copies of String instances holding the full name of a class, e.g. "java.util.Optional"
or "com.mycompany.Id". The path to the GC root looks like:

String <- sun.reflect.generics.tree.SimpleClassTypeSignature <- Object[] <- ArrayList <- ClassTypeSignature <- MethodTypeSignature <- MethodRepository <- Method

Looking at how these SimpleClassTypeSignature instances are created, they appear to come from sun.reflect.generics.parser.SignatureParser, which calls
`input.substring(mark, index)`, possibly followed by a call to `replace('/', '.')` to convert the package separators. In all but the simplest of cases, this will return a new String
for every call.
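
As a minimal standalone sketch of that behavior (this is not the JDK code; the class `SubstringDemo`, the helper `parseName`, and the sample signature are made up for illustration), slicing the same name out of a signature twice yields two equal but distinct String instances:

```java
public class SubstringDemo {
    // Hypothetical stand-in for the parser step described above:
    // slice the name out of the signature, then turn '/' into '.'.
    public static String parseName(String input, int mark, int index) {
        return input.substring(mark, index).replace('/', '.');
    }

    public static void main(String[] args) {
        // Sample generic signature, chosen only for illustration.
        String signature = "Ljava/util/Optional<Ljava/lang/String;>;";
        int end = signature.indexOf('<');

        String a = parseName(signature, 1, end);
        String b = parseName(signature, 1, end);

        System.out.println(a);                      // the dotted class name
        System.out.println("equal: " + a.equals(b)); // same contents
        System.out.println("same:  " + (a == b));    // but separate objects
    }
}
```

Each call allocates its own String because `substring` on a proper sub-range and `replace` with at least one match both produce new objects.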

Since this String represents a class name, its cardinality should by nature be very low: for each unique type, there will be many methods referring to it.
Additionally, this generic information is lazy-loaded at most once per Method object.

Therefore, SimpleClassTypeSignature.n seems like a natural place to apply String.intern(), for example changing:

public static SimpleClassTypeSignature make(String n,
                                            boolean dollar,
                                            TypeArgument[] tas) {
    return new SimpleClassTypeSignature(n, dollar, tas);
}

to intern the name:

public static SimpleClassTypeSignature make(String n,
                                            boolean dollar,
                                            TypeArgument[] tas) {
    return new SimpleClassTypeSignature(n.intern(), dollar, tas);
}

With any luck, maybe this can even share the same string instance as the class itself uses.
Am I correct in thinking this would be a moderately nice improvement at a relatively low cost?
Or perhaps there's some reason this isn't a good idea?
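
To make the expected effect concrete, here is a rough sketch that counts distinct String instances with and without interning. It is only an illustration under my own assumptions: `InternDemo`, `parseMany`, and `distinctInstances` are hypothetical names, and the loop merely imitates many methods whose signatures mention the same type.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;

public class InternDemo {
    // Imitates parsing the same class name out of `count` method signatures.
    // When `intern` is true, each parsed name is passed through String.intern().
    public static List<String> parseMany(String signature, int count, boolean intern) {
        int end = signature.indexOf('<');
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String n = signature.substring(1, end).replace('/', '.');
            names.add(intern ? n.intern() : n);
        }
        return names;
    }

    // Counts distinct objects by reference identity, not by equals().
    public static int distinctInstances(List<String> names) {
        IdentityHashMap<String, Boolean> seen = new IdentityHashMap<>();
        for (String n : names) {
            seen.put(n, Boolean.TRUE);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        String signature = "Ljava/util/Optional<Ljava/lang/String;>;";
        System.out.println("without intern: "
                + distinctInstances(parseMany(signature, 1000, false)));
        System.out.println("with intern:    "
                + distinctInstances(parseMany(signature, 1000, true)));
    }
}
```

Without interning, every parsed name is its own object; with interning, all of them collapse to one canonical instance. Whether that instance is also the one the Class object holds is not something this sketch checks.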

Thank you for your thoughts on the subject,
Steven


