RFR: String Density/Compact String JEP 254 (update)
Xueming Shen
xueming.shen at oracle.com
Fri Oct 30 21:30:23 UTC 2015
Hi,
Thanks for the comments/suggestions. Here are the updated webrevs, with
minor changes here and there based on the feedback.
http://cr.openjdk.java.net/~sherman/8054307/jdk/
http://cr.openjdk.java.net/~thartmann/compact_strings/webrev/hotspot/
[closed, Oracle internal only]
http://javaweb.us.oracle.com/~tohartma/compact_strings/hotspot/
http://javaweb.us.oracle.com/~tohartma/compact_strings/hotspot_test_closed/
The code is ready for integration. The current plan is to integrate via
the hotspot repo in the coming week if it passes the PIT.
Thanks
-Sherman
On 10/5/15 8:30 AM, Xueming Shen wrote:
> (resent to hotspot-dev at openjdk.java.net)
>
> Hi,
>
> Please review the change for JEP 254/Compact String project.
>
> JEP 254: http://openjdk.java.net/jeps/254
> Issue: https://bugs.openjdk.java.net/browse/JDK-8054307
> Webrevs: http://cr.openjdk.java.net/~sherman/8054307/jdk/
> http://cr.openjdk.java.net/~thartmann/compact_strings/webrev/hotspot
>
> Description:
>
> The String Density project changes the internal representation of the
> String class from a UTF-16 char array to a byte array plus an encoding
> flag field. The new String class stores characters encoded either as
> ISO-8859-1/Latin-1 (one byte per character), or as UTF-16 (two bytes
> per character), based upon the contents of the string. The encoding
> flag indicates which encoding is used. It offers a reduced memory
> footprint while maintaining throughput performance. See JEP 254 for
> additional information.
>
> Implementation repo/try out:
> http://hg.openjdk.java.net/jdk9/sandbox/ branch: JDK-8054307-branch
>
> $ hg clone http://hg.openjdk.java.net/jdk9/sandbox/
> $ cd sandbox
> $ sh ./get_source.sh
> $ sh ./common/bin/hgforest.sh up -r JDK-8054307-branch
> $ make configure
> $ make images
>
> Implementation Notes:
>
> - To change the internal representation of the String and the String
> builder classes (AbstractStringBuilder, StringBuilder and StringBuffer)
> from a UTF-16 char array to a byte array plus an encoding flag field.
>
> The new representation stores the String characters in a single-byte
> format, using the lower 8 bits of each character's 16-bit UTF-16 value,
> and sets the encoding flag to LATIN1 if all characters of the String
> object are Unicode Latin-1 characters (UTF-16 value < \u0100).
>
> It stores the String characters in a 2-byte format with their UTF-16
> values and sets the flag to UTF16 if any character in the String
> object is NOT a Unicode Latin-1 character.
>
> - To change the method implementation of the String class and its
> builders to function on the new internal character storage, mainly to
> delegate to two implementation classes, StringUTF16 and StringLatin1.
>
> - To update the StringCoding class to decode/encode the String between
> String.byte[]/coder(LATIN1/UTF16) <-> byte[] (native encoding) instead
> of the original String.char[] <-> byte[] (native encoding).
>
> - To update the HotSpot compiler (new and updated intrinsics), GC
> (String Deduplication mods) and Runtime to work with the new internal
> "byte[] + coder flag" representation.
>
> See Tobias's note for details of the hotspot changes:
> http://cr.openjdk.java.net/~thartmann/compact_strings/hotspot-impl-note
>
> - To add a VM option "CompactStrings" (default is true) to provide a
> switch-off mechanism that always stores the String characters in UTF16
> encoding (always 2 bytes, but still in a byte[], instead of the
> original char[]).
>
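The "byte[] + coder flag" layout described in the notes above can be
illustrated with a minimal sketch. This is not the code in the webrev:
the class name CompactStringSketch, the constant values LATIN1 = 0 and
UTF16 = 1, and the big-endian byte order in the 2-byte branch are all
assumptions made for illustration only.

```java
// Illustrative sketch only; names, constant values and byte order are
// assumptions, not taken from the actual JDK implementation.
final class CompactStringSketch {
    static final byte LATIN1 = 0; // assumed value: one byte per character
    static final byte UTF16 = 1;  // assumed value: two bytes per character

    final byte[] value; // character storage
    final byte coder;   // encoding flag describing how 'value' is laid out

    CompactStringSketch(char[] chars) {
        // The string is compressible only if every char is below \u0100.
        boolean latin1 = true;
        for (char c : chars) {
            if (c >= 0x100) {
                latin1 = false;
                break;
            }
        }
        if (latin1) {
            // Keep only the lower 8 bits of each 16-bit UTF-16 value.
            byte[] b = new byte[chars.length];
            for (int i = 0; i < chars.length; i++) {
                b[i] = (byte) chars[i];
            }
            value = b;
            coder = LATIN1;
        } else {
            // Store both bytes of every char (big-endian chosen arbitrarily here).
            byte[] b = new byte[chars.length * 2];
            for (int i = 0; i < chars.length; i++) {
                b[2 * i] = (byte) (chars[i] >> 8);
                b[2 * i + 1] = (byte) chars[i];
            }
            value = b;
            coder = UTF16;
        }
    }

    // With the assumed constants, shifting by the coder halves the byte count for UTF16.
    int length() {
        return value.length >> coder;
    }

    // Dispatch on the coder, analogous to delegating to a per-encoding helper class.
    char charAt(int index) {
        if (coder == LATIN1) {
            return (char) (value[index] & 0xff);
        }
        int hi = value[2 * index] & 0xff;
        int lo = value[2 * index + 1] & 0xff;
        return (char) ((hi << 8) | lo);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"hello", "caf\u00e9", "hi\u4e16"}) {
            CompactStringSketch cs = new CompactStringSketch(s.toCharArray());
            System.out.println(cs.length() + " chars, " + cs.value.length
                    + " bytes, coder=" + cs.coder);
        }
    }
}
```

In this sketch a single non-Latin-1 character forces the whole string
into the 2-byte form, which matches the all-or-nothing rule in the notes.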
>
> Supporting performance artifacts:
>
> - Report(s) on memory footprint impact
>
> http://cr.openjdk.java.net/~shade/density/string-density-report.pdf
>
> Latest SPECjbb2005 footprint reduction and throughput numbers for both
> Intel (Linux) and SPARC, which show that the Compact String binaries
> use less memory and have higher throughput.
>
> latest: http://cr.openjdk.java.net/~sherman/8054307/specjbb2005
> old: http://cr.openjdk.java.net/~huntch/string-density/reports/String-Density-SPARC-jbb2005-Report.pdf
>
> - Throughput performance impact via String API micro-benchmarks
>
> http://cr.openjdk.java.net/~thartmann/compact_strings/microbenchmarks/Haswell_090915.pdf
>
> http://cr.openjdk.java.net/~thartmann/compact_strings/microbenchmarks/IvyBridge_090915.pdf
>
> http://cr.openjdk.java.net/~thartmann/compact_strings/microbenchmarks/Sparc_090915.pdf
>
> http://cr.openjdk.java.net/~sherman/8054307/string-coding.txt
>
> Thanks,
> Sherman