RFR: String Density/Compact String JEP 254 (update)

Xueming Shen xueming.shen at oracle.com
Fri Oct 30 21:30:23 UTC 2015


Hi,

Thanks for the comments/suggestions. Here are the updated webrevs with
minor changes here and there based on the feedback.

http://cr.openjdk.java.net/~sherman/8054307/jdk/
http://cr.openjdk.java.net/~thartmann/compact_strings/webrev/hotspot/

[closed, Oracle internal only]
http://javaweb.us.oracle.com/~tohartma/compact_strings/hotspot/
http://javaweb.us.oracle.com/~tohartma/compact_strings/hotspot_test_closed/

The code is ready for integration. The current plan is to integrate via
the hotspot repo in the coming week if it passes PIT.

Thanks
-Sherman

On 10/5/15 8:30 AM, Xueming Shen wrote:
> (resent to hotspot-dev at openjdk.java.net)
>
> Hi,
>
> Please review the change for JEP 254/Compact String project.
>
> JEP 254: http://openjdk.java.net/jeps/254
> Issue:   https://bugs.openjdk.java.net/browse/JDK-8054307
> Webrevs: http://cr.openjdk.java.net/~sherman/8054307/jdk/
>          http://cr.openjdk.java.net/~thartmann/compact_strings/webrev/hotspot
>
> Description:
>
>   The String Density project changes the internal representation of the
>   String class from a UTF-16 char array to a byte array plus an encoding
>   flag field. The new String class stores characters encoded either as
>   ISO-8859-1/Latin-1 (one byte per character), or as UTF-16 (two bytes
>   per character), based upon the contents of the string. The encoding
>   flag indicates which encoding is used. It offers a reduced memory
>   footprint while maintaining throughput performance. See JEP 254 for
>   additional information.
>
> Implementation repo/try out:
>   http://hg.openjdk.java.net/jdk9/sandbox/  branch: JDK-8054307-branch
>
>   $ hg clone http://hg.openjdk.java.net/jdk9/sandbox/
>   $ cd sandbox
>   $ sh ./get_source.sh
>   $ sh ./common/bin/hgforest.sh up -r JDK-8054307-branch
>   $ make configure
>   $ make images
>
> Implementation Notes:
>
>  - To change the internal representation of the String and the String
>    builder classes (AbstractStringBuilder, StringBuilder and
>    StringBuffer) from a UTF-16 char array to a byte array plus an
>    encoding flag field.
>
>    The new representation stores the String characters in a single-byte
>    format, using the lower 8 bits of each character's 16-bit UTF-16
>    value, and sets the encoding flag to LATIN1 if all characters of the
>    String object are Unicode Latin-1 characters (UTF-16 value < \u0100).
>
>    It stores the String characters in a 2-byte format with their UTF-16
>    value and sets the flag to UTF16 if any character in the String
>    object is NOT a Unicode Latin-1 character.
>
>  - To change the method implementations of the String class and its
>    builders to operate on the new internal character storage, mainly by
>    delegating to two implementation classes, StringUTF16 and
>    StringLatin1.
>
>  - To update the StringCoding class to decode/encode the String between
>    String.byte[]/coder(LATIN1/UTF16) <-> byte[] (native encoding)
>    instead of the original String.char[] <-> byte[] (native encoding).
>
>  - To update the HotSpot compiler (new and updated intrinsics), GC
>    (String Deduplication mods) and Runtime to work with the new internal
>    "byte[] + coder flag" representation.
>
>    See Tobias's note for details of the hotspot changes:
> http://cr.openjdk.java.net/~thartmann/compact_strings/hotspot-impl-note
>
>  - To add a VM option "CompactStrings" (default is true) to provide a
>    switch-off mechanism that always stores the String characters in UTF16
>    encoding (always 2 bytes, but still in a byte[], instead of the
>    original char[]).
>
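The storage rule in the implementation notes above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the webrev: the class name CompactStringSketch, the flag values 0 and 1, the big-endian byte order used for the UTF16 case, and the compactStrings constructor parameter (standing in for the CompactStrings VM option) are all made up for the example.

```java
// Hypothetical illustration only; this is not the JDK's java.lang.String code.
final class CompactStringSketch {
    static final byte LATIN1 = 0; // assumed flag value for this sketch
    static final byte UTF16 = 1;  // assumed flag value for this sketch

    final byte[] value; // character storage
    final byte coder;   // encoding flag: LATIN1 or UTF16

    // compactStrings stands in for the CompactStrings VM option (assumption).
    CompactStringSketch(char[] chars, boolean compactStrings) {
        if (compactStrings && allLatin1(chars)) {
            // One byte per character: keep the lower 8 bits of each char.
            byte[] b = new byte[chars.length];
            for (int i = 0; i < chars.length; i++) {
                b[i] = (byte) chars[i];
            }
            value = b;
            coder = LATIN1;
        } else {
            // Two bytes per character; big-endian order is chosen arbitrarily here.
            byte[] b = new byte[chars.length * 2];
            for (int i = 0; i < chars.length; i++) {
                b[2 * i] = (byte) (chars[i] >> 8);
                b[2 * i + 1] = (byte) chars[i];
            }
            value = b;
            coder = UTF16;
        }
    }

    // True if every char has a UTF-16 value below 0x100.
    static boolean allLatin1(char[] chars) {
        for (char c : chars) {
            if (c >= 0x100) {
                return false;
            }
        }
        return true;
    }

    int length() {
        return coder == LATIN1 ? value.length : value.length >> 1;
    }

    // Each accessor branches on the coder, mirroring the idea of delegating
    // to a Latin-1 or a UTF-16 implementation.
    char charAt(int index) {
        if (coder == LATIN1) {
            return (char) (value[index] & 0xff);
        }
        int hi = value[2 * index] & 0xff;
        int lo = value[2 * index + 1] & 0xff;
        return (char) ((hi << 8) | lo);
    }

    public static void main(String[] args) {
        CompactStringSketch latin = new CompactStringSketch("caf\u00e9".toCharArray(), true);
        CompactStringSketch wide = new CompactStringSketch("a\u4e2d".toCharArray(), true);
        System.out.println("latin: coder=" + latin.coder + " bytes=" + latin.value.length);
        System.out.println("wide:  coder=" + wide.coder + " bytes=" + wide.value.length);
    }
}
```

Passing false for compactStrings forces the 2-byte layout even for pure Latin-1 input, which is the behaviour the switch-off option is described as providing.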
>
> Supporting performance artifacts:
>
>  - Report(s) on memory footprint impact
>
> http://cr.openjdk.java.net/~shade/density/string-density-report.pdf
>
>    Latest SPECjbb2005 footprint reduction and throughput numbers for both
>    Intel (Linux) and SPARC, which show that the Compact String binaries
>    use less memory and have higher throughput.
>
>    latest: http://cr.openjdk.java.net/~sherman/8054307/specjbb2005
>    old:    http://cr.openjdk.java.net/~huntch/string-density/reports/String-Density-SPARC-jbb2005-Report.pdf
>
>  - Throughput performance impact via String API micro-benchmarks
>
> http://cr.openjdk.java.net/~thartmann/compact_strings/microbenchmarks/Haswell_090915.pdf 
>
> http://cr.openjdk.java.net/~thartmann/compact_strings/microbenchmarks/IvyBridge_090915.pdf 
>
> http://cr.openjdk.java.net/~thartmann/compact_strings/microbenchmarks/Sparc_090915.pdf 
>
>    http://cr.openjdk.java.net/~sherman/8054307/string-coding.txt
>
> Thanks,
> Sherman




More information about the core-libs-dev mailing list