RFR: 8327247: C2 uses up to 2GB of RAM to compile complex string concat in extreme cases

Chen Liang liach at openjdk.org
Fri Apr 12 14:39:43 UTC 2024


On Tue, 9 Apr 2024 12:01:49 GMT, Claes Redestad <redestad at openjdk.org> wrote:

> This patch suggests a workaround to an issue with huge SCF MH expression trees taking excessive JIT compilation resources by reviving (part of) the simple bytecode-generating strategy that was originally available as an all-or-nothing strategy choice. 
> 
> Instead of reintroducing a binary strategy choice I propose a threshold parameter, controlled by `-Djava.lang.invoke.StringConcat.highArityThreshold=<val>`: For expressions below or at this threshold there's no change, for expressions with an arity above it we use the `StringBuilder`-chain bytecode generator. 
> 
> There are a few trade-offs at play here which influence the choice of threshold. The simple high arity strategy will for example not see any reuse of LambdaForms but strictly always generate a class per indy callsite, which means we might end up with a higher total number of classes generated and loaded in applications if we set this value too low. It may also produce worse performance on average. On the other hand there is the observed increase in C2 resource usage as expressions grow unwieldy. On the other other hand high arity expressions are likely rare to begin with, with less opportunities for sharing than the more common low-arity expressions. 
> 
> I turned the submitted test case into a few JMH benchmarks and did some experiments with `-XX:CompileCommand=MemStat,StringConcat::concat,print`:
> 
> Baseline strategy:
> 13 args: 6.3M
> 23 args: 18M
> 123 args: 868M
> 
> `-Djava.lang.invoke.StringConcat.highArityThreshold=0`:
> 13 args: 2.11M
> 23 args: 3.67M
> 123 args: 4.75M
> 
> For 123 args the memory overhead of the baseline strategy is 180x, but for 23 args we're down to a 5x memory overhead, and down to a 3x overhead for 13 args. Since the absolute overhead for 23 is borderline acceptable (+15Mb) I've conservatively chosen a threshold at arity 20. This keeps C2 resource pressure at a reasonable level (< 18M) while avoiding perturbing performance at the vast majority of call sites.
> 
> I was asked to use the new class file API for mainline. There's a version of this patch implemented using ASM in 7c52a9f which might be a reasonable basis for a backport.

src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java line 1406:

> 1404:                         @Override
> 1405:                         public void accept(ClassBuilder clb) {
> 1406:                             clb.withFlags(AccessFlag.PUBLIC, AccessFlag.FINAL, AccessFlag.SUPER, AccessFlag.SYNTHETIC)

Why is this hidden class public?

src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java line 1409:

> 1407:                                 .withMethodBody(METHOD_NAME,
> 1408:                                         MethodTypeDesc.ofDescriptor(args.toMethodDescriptorString()),
> 1409:                                         ClassFile.ACC_FINAL | ClassFile.ACC_PUBLIC | ClassFile.ACC_STATIC,

Why is this method final? Don't think this method can ever be hidden.

src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java line 1416:

> 1414:                 Class<?> innerClass = lookup.defineHiddenClass(classBytes, true, STRONG).lookupClass();
> 1415:                 DUMPER.dumpClass(className, innerClass, classBytes);
> 1416:                 MethodHandle mh = lookup.findStatic(innerClass, METHOD_NAME, args);

We can probably make the method private, and use the lookup from `defineHiddenClass` instead of the host class lookup.

src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java line 1425:

> 1423: 
> 1424:         private static Consumer<CodeBuilder> generateMethod(String[] constants, MethodType args) {
> 1425:             return new Consumer<CodeBuilder>() {

Are you using inner classes for startup performance concerns? java.lang.invoke should be ready when you can reach here so using lambdas is safe.

src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java line 1458:

> 1456: 
> 1457:         // Load the argument of type cl at slot onto stack, return the number of argument stack slots consumed.
> 1458:         private static int load(CodeBuilder cb, Class<?> cl, int slot) {

You can replace all calls to `load(cb, cl, slot)` with `cb.loadInstruction(TypeKind.from(cl), slot)`, and the retrn slot count can be accessed by `TypeKind::slotSize`

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18690#discussion_r1562629418
PR Review Comment: https://git.openjdk.org/jdk/pull/18690#discussion_r1562630031
PR Review Comment: https://git.openjdk.org/jdk/pull/18690#discussion_r1562635315
PR Review Comment: https://git.openjdk.org/jdk/pull/18690#discussion_r1562637025
PR Review Comment: https://git.openjdk.org/jdk/pull/18690#discussion_r1562640252


More information about the core-libs-dev mailing list