[PATCH] SignatureParser performance problems due to excessive StringBuilder usage

Max Kanat-Alexander mkanat at google.com
Tue Nov 29 10:48:23 UTC 2016


Currently, any heavy use of sun.reflect.generics.parser.SignatureParser
leads to many calls to Arrays.copyOf, due to using StringBuilder inside of
SignatureParser to build up a string a character at a time. This showed up
as a top CPU-usage culprit in profiling of a large number of tests,
particularly those which use reflection-heavy dependency injection systems
such as Guice.

To determine the size to pre-size the StringBuilder in this patch, I made
AbstractStringBuilder.expandCapacity print out the original capacity and
new capacity every time it did a resize. I measured the outputs that
happened during a heavy real-world user of SignatureParser, and took
particular measurements at different StringBuilder sizes:

    16: Reallocs: 907310 Total Alloc Counting Only Realloc: 84324048
Estimated Total Alloc: 84310608 Copy Bytes: 27249937 Reallocs from initial
size: 517883
    32: Reallocs: 497220 Total Alloc Counting Only Realloc: 68184799
Estimated Total Alloc: 75412799 Copy Bytes: 22138274 Reallocs from initial
size: 291168
    40: Reallocs: 331045 Total Alloc Counting Only Realloc: 54170531
Estimated Total Alloc: 69096811 Copy Bytes: 17560068 Reallocs from initial
size: 143886
    48: Reallocs: 222232 Total Alloc Counting Only Realloc: 42141244
Estimated Total Alloc: 65213692 Copy Bytes: 13637722 Reallocs from initial
size: 36367
    56: Reallocs: 196595 Total Alloc Counting Only Realloc: 39246690
Estimated Total Alloc: 67601618 Copy Bytes: 12698496 Reallocs from initial
size: 10705
    64: Reallocs: 185941 Total Alloc Counting Only Realloc: 38031623
Estimated Total Alloc: 71112263 Copy Bytes: 12304703 Reallocs from initial
size: 158

As you can see, 48 gives us the lowest total memory allocation as well as
significantly reducing the number of times we reallocate arrays. With
higher values it becomes significantly non-linear in improvement and we
start to allocate more memory overall, so 48 seemed like the right number
to pick.

There is no bug which exactly matches this problem.
https://bugs.openjdk.java.net/browse/JDK-8035424 is close, but describes a
different root cause for performance issues--one which I did not actually
see as occupying significant CPU time in profiles.

This is my first OpenJDK contribution, so I can't file a bug myself.

There is no test included as this is entirely a performance problem. This
patch should be entirely behavior-preserving.

# HG changeset patch
# User mkanat
# Date 1480414839 28800
#      Tue Nov 29 02:20:39 2016 -0800
# Node ID 514fca7d9aa19b074eb32ee8ac700c796b4d9444
# Parent  7901a13a051ce1630477ef9f8a17c5af6c515e69
Optimize SignatureParser's use of StringBuilder.

diff --git a/src/java.base/share/classes/sun/reflect/generics/parser/SignatureParser.java
b/src/java.base/share/classes/sun/reflect/generics/parser/SignatureParser.java
--- a/src/java.base/share/classes/sun/reflect/generics/parser/SignatureParser.java
+++ b/src/java.base/share/classes/sun/reflect/generics/parser/SignatureParser.java
@@ -70,6 +70,11 @@
     private static final char EOI = ':';
     private static final boolean DEBUG = false;

+    // StringBuilder does a lot of array copies if we don't pre-size
+    // it when parsing a full class name. This value was determined
+    // empirically by measurements of real-world code.
+    public static final int CLASS_NAME_SB_SIZE = 48;
+
     // private constructor - enforces use of static factory
     private SignatureParser(){}

@@ -251,9 +256,19 @@
         return FormalTypeParameter.make(id, bs);
     }

-    private String parseIdentifier(){
+    private String parseIdentifier() {
         StringBuilder result = new StringBuilder();
-        while (!Character.isWhitespace(current())) {
+        parseIdentifierInto(result);
+        return result.toString();
+    }
+
+    // This is separate from parseIdentifier for performance reasons.
+    // For a caller who already has a StringBuilder, it's much faster to
+    // re-use the existing builder, because it results in far fewer internal
+    // array copies inside of StringBuilder.
+    private void parseIdentifierInto(StringBuilder result) {
+        int startIndex = index;
+        parseLoop: while (!Character.isWhitespace(current())) {
             char c = current();
             switch(c) {
             case ';':
@@ -263,16 +278,14 @@
             case ':':
             case '>':
             case '<':
-                return result.toString();
-            default:{
-                result.append(c);
+                break parseLoop;
+            default:
                 advance();
             }
+        }
+        result.append(input, startIndex, index - startIndex);
+    }

-            }
-        }
-        return result.toString();
-    }
     /**
      * FieldTypeSignature:
      *     ClassTypeSignature
@@ -325,18 +338,16 @@
         // Parse both any optional leading PackageSpecifier as well as
         // the following SimpleClassTypeSignature.

-        String id = parseIdentifier();
+        StringBuilder idBuild = new StringBuilder(CLASS_NAME_SB_SIZE);
+        parseIdentifierInto(idBuild);

-        if (current() == '/') { // package name
-            StringBuilder idBuild = new StringBuilder(id);
+        while (current() == '/') { // package name
+            advance();
+            idBuild.append('.');
+            parseIdentifierInto(idBuild);
+        }

-            while(current() == '/') {
-                advance();
-                idBuild.append(".");
-                idBuild.append(parseIdentifier());
-            }
-            id = idBuild.toString();
-        }
+        String id = idBuild.toString();

         switch (current()) {
         case ';':


More information about the core-libs-dev mailing list