[10] RFR: 8185104: Generate CharacterDataLatin1 lookup tables directly

Alan Bateman Alan.Bateman at oracle.com
Mon Jul 24 08:29:00 UTC 2017



On 23/07/2017 14:37, Claes Redestad wrote:
> Hi,
>
> java.lang.CharacterDataLatin1 and others are generated at build time 
> by the GenerateCharacter tool, which has a -string mode to generate 
> lookup tables as String literals rather than arrays directly. This 
> serves multiple purposes:
>
> 1. it reduces the size of the generated bytecode, which is necessary 
> to keep code below method bytecode limits if the arrays generated are 
> very large
> 2. it may under certain circumstances (large enough arrays, JITed 
> code) be a performance optimization
>
> While having other performance benefits, the compact strings feature 
> that went into 9 made String::toCharArray less friendly to startup, 
> and since the same feature means we're now always loading 
> CharacterDataLatin1 on bootstrap in all locales it seemed reasonable 
> to re-examine whether this class in particular really gains from 
> generating its lookup tables via String literals.
>
> Turns out it doesn't. By generating CharacterDataLatin1 tables as 
> arrays explicitly:
>
> - executed bytecodes drop from 21,782 to 2,077 when initializing this 
> class (approximately 2% of executed bytecode; 1.5% of total instructions)
> - slight reduction (~1 KB) in the minimum retained heap usage
> - the size of CharacterDataLatin1.class only grows from 6458 to 7385 
> bytes
>
> Proposed patch is to drop passing -string to GenerateCharacter for the 
> latin1 case:
>
> Webrev: http://cr.openjdk.java.net/~redestad/8185104/jdk.00/
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8185104
This is good sleuthing. I can't see any reason why the tables in 
CharacterDataLatin1 need to be generated as Strings now, so I think 
this change is good.
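
For anyone following along, the difference between the two table styles
can be sketched roughly as below. This is only an illustration under
stated assumptions, not the code GenerateCharacter emits: the class name
LookupTableSketch, the field names, the lookup helper and the table
values are all made up for the example.

```java
import java.util.Arrays;

// Hypothetical sketch: names and table values are invented for illustration
// and do not come from GenerateCharacter or CharacterDataLatin1.
final class LookupTableSketch {

    // String-literal style: the class file carries one constant-pool string,
    // so the initializer bytecode is small, but the table has to be decoded
    // with String::toCharArray when the class is initialized.
    static final char[] FROM_STRING =
        "\u4800\u100F\u4800\u5800\u400F\u0018".toCharArray();

    // Direct array style: the class initializer stores each element
    // explicitly, which costs more bytecode per entry but needs no
    // String decoding at startup.
    static final char[] FROM_ARRAY = {
        0x4800, 0x100F, 0x4800, 0x5800, 0x400F, 0x0018
    };

    // Both styles end up as an ordinary array, so lookups are identical.
    static int lookup(char[] table, int index) {
        return table[index];
    }

    public static void main(String[] args) {
        System.out.println("tables equal: "
            + Arrays.equals(FROM_STRING, FROM_ARRAY));
    }
}
```

Either form yields the same array contents; the trade-off discussed in
the quoted mail is only about class file size versus the work done in
the class initializer.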

-Alan


