[10] RFR: 8185104: Generate CharacterDataLatin1 lookup tables directly

Claes Redestad claes.redestad at oracle.com
Sun Jul 23 13:37:44 UTC 2017


java.lang.CharacterDataLatin1 and others are generated at build time by 
the GenerateCharacter tool, which has a -string mode to generate lookup 
tables as String literals rather than arrays directly. This serves 
multiple purposes:

1. it reduces the size of the generated bytecode, which is necessary to 
keep code below method bytecode limits if the arrays generated are very 
large
2. it may under certain circumstances (large enough arrays, JITed code) 
be a performance optimization
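
To make the two styles concrete, here is a minimal sketch of a table 
built from a String literal next to the same table written as an array 
initializer. The class name, field names and table values are made up 
for illustration, and the two-chars-per-int packing is an assumption of 
this sketch; it is not the actual GenerateCharacter output.

```java
public class LookupTableStyles {
    // Style 1: the table lives in the constant pool as one String literal.
    // Each int entry is packed into two chars, high half first (an assumed
    // layout for this sketch; the values are illustrative only).
    static final String PACKED = "\u0001\u0002\u0003\u0004";
    static final int[] FROM_STRING = unpack(PACKED);

    // Unpacking runs at class initialization: toCharArray() copies the
    // string contents, then a loop rebuilds each int from a pair of chars.
    static int[] unpack(String packed) {
        char[] data = packed.toCharArray();
        int[] table = new int[data.length / 2];
        for (int i = 0, j = 0; j < table.length; j++) {
            int high = data[i++] << 16;
            table[j] = high | data[i++];
        }
        return table;
    }

    // Style 2: a plain array initializer. javac emits a store sequence per
    // element, so the class file is larger but no unpacking loop is needed.
    static final int[] DIRECT = { 0x00010002, 0x00030004 };

    public static void main(String[] args) {
        // Both styles should produce identical tables.
        System.out.println(java.util.Arrays.equals(FROM_STRING, DIRECT));
    }
}
```

The trade-off is the one described above: the String form keeps the 
class initializer's bytecode small but has to run the unpacking loop, 
while the array form costs more bytes in the class file and runs fewer 
bytecodes for a small table.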

While having other performance benefits, the compact strings feature 
that went into 9 made String::toCharArray less friendly to startup, and 
since the same feature means we're now always loading 
CharacterDataLatin1 on bootstrap in all locales it seemed reasonable to 
re-examine whether this class in particular really gains from generating 
its lookup tables via String literals.

Turns out it doesn't. By generating CharacterDataLatin1 tables as arrays:

- executed bytecodes drop from 21,782 to 2,077 when initializing this 
class (approximately 2% of executed bytecode; 1.5% of total instructions)
- there is a slight reduction (~1 KB) in the minimum retained heap usage
- the size of CharacterDataLatin1.class only grows from 6458 to 7385 bytes

Proposed patch is to drop passing -string to GenerateCharacter for the 
latin1 case:

Webrev: http://cr.openjdk.java.net/~redestad/8185104/jdk.00/
Bug:    https://bugs.openjdk.java.net/browse/JDK-8185104


