FYC: 7197183 : Provide CharSequence.subSequenceView which allows for sub-sequence views of character sequences.
Louis Wasserman
lowasser at google.com
Thu Jul 17 00:12:47 UTC 2014
If I recall correctly, CharBuffer.wrap(charSequence).subSequence(from, to)
has equivalent semantics without requiring a new implementation, even if
the implementation is fairly simple.
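
A minimal sketch of that approach, assuming only the standard java.nio.CharBuffer API; the class name CharBufferViewDemo and the helper view() are made up for illustration and are not part of the proposal:

```java
import java.nio.CharBuffer;

public class CharBufferViewDemo {

    /**
     * Hypothetical helper: returns a read-only view of source[from, to)
     * backed by the original sequence rather than a copy of its characters.
     */
    static CharSequence view(CharSequence source, int from, int to) {
        // wrap() creates a read-only buffer over the sequence;
        // subSequence() narrows it relative to the buffer's position (0 here).
        return CharBuffer.wrap(source).subSequence(from, to);
    }

    public static void main(String[] args) {
        CharSequence v = view("hello world", 6, 11);
        System.out.println(v);          // prints "world"
        System.out.println(v.length()); // prints 5
    }
}
```

Note that the bounds of such a buffer are fixed when it is created, so this covers the fixed-range variant but not the IntSupplier-based one.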
On Wed, Jul 16, 2014 at 5:09 PM, Mike Duigou <mike.duigou at oracle.com> wrote:
> Hello all;
>
> In Java 7u6 there was a significant change in the implementation of
> java.lang.String (JDK-6924259). This was done to reduce the size of String
> instances and it has been generally regarded as a positive change. As with
> almost any significant change to a class as core to Java as String there
> have also been applications negatively impacted. Most of the problems
> involve applications which make heavy use of String.substring() as
> sub-string instances now involve creation of their own copies of the
> backing characters.
>
> There have been previous discussions of mitigations to the 6924259 change
> in String.substring() behaviour. These discussions haven't come to positive
> conclusions mostly because they generally require too many changes to the
> specification or behaviour of String. So here's another proposal (enclosed)
> that doesn't change the behaviour of any existing classes. It adds two new
> methods to CharSequence to create sub-sequence views of character
> sequences. The size of sub-sequence instances very closely matches the size
> of pre-6924259 String instances and indeed the implementation has the same
> pre-6924259 limitations, namely that the entire source CharSequence remains
> alive as long as the sub-sequence is referenced.
>
> Unlike pre-6924259 String instances, a CharSubSequenceView cannot be
> reliably compared via equals() to String instances, and it is unsuitable
> for use as a hash map key.
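
The equals() caveat applies to CharSequence implementations in general, and can be illustrated with an existing one. The sketch below uses java.nio.CharBuffer as a stand-in for a view (the proposed CharSubSequenceView is not used here); the class and helper names are hypothetical:

```java
import java.nio.CharBuffer;
import java.util.HashMap;
import java.util.Map;

public class ViewEqualityDemo {

    /** Plain equals(): a non-String CharSequence is never equal to a String. */
    static boolean equalsString(CharSequence view, String s) {
        return view.equals(s);
    }

    /** Content comparison via String.contentEquals(CharSequence). */
    static boolean sameContent(CharSequence view, String s) {
        return s.contentEquals(view);
    }

    /** Looks up a String-keyed map using a CharSequence as the probe key. */
    static Integer lookup(Map<String, Integer> map, CharSequence key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        CharSequence view = CharBuffer.wrap("prefix:value").subSequence(7, 12);

        Map<String, Integer> map = new HashMap<>();
        map.put("value", 1);

        System.out.println(equalsString(view, "value")); // prints false
        System.out.println(sameContent(view, "value"));  // prints true
        System.out.println(lookup(map, view));           // prints null
    }
}
```

Callers that need map lookups or equality against String would have to convert with toString() first, which copies the characters.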
>
> With these benefits and caveats in mind, would you use this?
>
> Mike
>
> diff -r 66f582158e1c src/share/classes/java/lang/CharSequence.java
> --- a/src/share/classes/java/lang/CharSequence.java Wed Jul 16 20:43:53 2014 +0100
> +++ b/src/share/classes/java/lang/CharSequence.java Wed Jul 16 16:58:52 2014 -0700
> @@ -25,11 +25,14 @@
>
> package java.lang;
>
> +import java.io.Serializable;
> import java.util.NoSuchElementException;
> +import java.util.Objects;
> import java.util.PrimitiveIterator;
> import java.util.Spliterator;
> import java.util.Spliterators;
> import java.util.function.IntConsumer;
> +import java.util.function.IntSupplier;
> import java.util.stream.IntStream;
> import java.util.stream.StreamSupport;
>
> @@ -231,4 +234,114 @@
> Spliterator.ORDERED,
> false);
> }
> +
> + /**
> + * Provides a sub-sequence view on a character sequence. Changes in the
> + * source will be reflected in the sub-sequence. The sub-sequence must, at
> + * all times, be a proper sub-sequence of the source character sequence.
> + *
> + * @since 1.9
> + */
> + static final class CharSubSequenceView implements CharSequence, Serializable {
> +
> + private final CharSequence source;
> + private final int fromInclusive;
> + private final IntSupplier toExclusive;
> +
> + CharSubSequenceView(CharSequence source, int fromInclusive, int toExclusive) {
> + this(source, fromInclusive, () -> toExclusive);
> + }
> +
> + CharSubSequenceView(CharSequence source, int fromInclusive, IntSupplier toExclusive) {
> + this.source = Objects.requireNonNull(source);
> + if(fromInclusive < 0 || fromInclusive >= source.length() ||
> + toExclusive.getAsInt() < fromInclusive || toExclusive.getAsInt() > source.length()) {
> + throw new IllegalArgumentException("Invalid index");
> + }
> + this.fromInclusive = fromInclusive;
> + this.toExclusive = toExclusive;
> + }
> +
> + @Override
> + public int length() {
> + return toExclusive.getAsInt() - fromInclusive;
> + }
> +
> + @Override
> + public char charAt(int index) {
> + if(index >= length()) {
> + throw new IllegalArgumentException("Invalid Index");
> + }
> + //
> + return source.charAt(fromInclusive + index);
> + }
> +
> + @Override
> + public CharSequence subSequence(int start, int end) {
> + if (end > length()) {
> + throw new IllegalArgumentException("Invalid Index");
> + }
> + return source.subSequence(fromInclusive + start, fromInclusive + end);
> + }
> +
> + @Override
> + public String toString() {
> + int len = length();
> + char[] chars = new char[len];
> + for(int each = 0; each < len; each++) {
> + chars[each] = charAt(each);
> + }
> + return new String(chars, true);
> + }
> + }
> +
> + /**
> + * Returns as a character sequence the specified sub-sequence view of the
> + * provided source character sequence. Changes in the source will be
> + * reflected in the sub-sequence. The sub-sequence must, at all times, be
> + * a proper sub-sequence of the source character sequence.
> + *
> + * @param source The character sequence from which the sub-sequence is
> + * derived.
> + * @param startInclusive The index of the character in the source character
> + * sequence which will be the first character in the sub-sequence.
> + * @param endExclusive The index after the last character in the source
> + * character sequence which will be included in the sub-sequence.
> + * @return the character sub-sequence.
> + * @since 1.9
> + */
> + static CharSequence subSequenceView(CharSequence source, int startInclusive, int endExclusive) {
> + return new CharSubSequenceView(source, startInclusive, endExclusive);
> + }
> +
> + /**
> + * Returns as a character sequence the specified sub-sequence view of the
> + * provided source character sequence. Changes in the source will be
> + * reflected in the sub-sequence. The sub-sequence must, at all times, be
> + * a proper sub-sequence of the source character sequence. This variation
> + * allows for the size of the sub-sequence to vary, usually to follow the
> + * size of a growing character sequence.
> + *
> + * @apiNote The most common usage of this method is to follow changes
> + * in the size of the source.
> + * {@code
> + * StringBuilder source = new StringBuilder("prefix:");
> + * CharSequence toEnd = CharSequence.subSequenceView(source, 7, source::length);
> + * }
> + * In this example the value of {@code toEnd} will always be a sub-sequence
> + * of {@code source} but will omit the first 7 characters.
> + *
> + * @param source The character sequence from which the sub-sequence is
> + * derived.
> + * @param startInclusive The index of the character in the source character
> + * sequence which will be the first character in the sub-sequence.
> + * @param endExclusive A supplier which returns the index after the last
> + * character in the source character sequence which will be included in
> + * the sub-sequence.
> + * @return the character sub-sequence.
> + * @since 1.9
> + */
> + static CharSequence subSequenceView(CharSequence source, int startInclusive, IntSupplier endExclusive) {
> + return new CharSubSequenceView(source, startInclusive, endExclusive);
> + }
> }
>
>
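
For experimenting with the growing-view idea outside the JDK, here is a standalone sketch. It is not the patch above: the names GrowingViewDemo, View and tail() are hypothetical, it throws IndexOutOfBoundsException rather than IllegalArgumentException, and it assumes an empty view starting at source.length() should be allowed (the patch's constructor check would reject that case).

```java
import java.util.Objects;
import java.util.function.IntSupplier;

public class GrowingViewDemo {

    /** Hypothetical stand-in for a sub-sequence view whose end can move. */
    static final class View implements CharSequence {
        private final CharSequence source;
        private final int from;
        private final IntSupplier to;

        View(CharSequence source, int from, IntSupplier to) {
            this.source = Objects.requireNonNull(source);
            this.to = Objects.requireNonNull(to);
            int end = to.getAsInt();
            // Assumption: from == end (an empty view) is permitted.
            if (from < 0 || from > end || end > source.length()) {
                throw new IndexOutOfBoundsException("Invalid range");
            }
            this.from = from;
        }

        @Override
        public int length() {
            return to.getAsInt() - from;
        }

        @Override
        public char charAt(int index) {
            if (index < 0 || index >= length()) {
                throw new IndexOutOfBoundsException("Invalid index: " + index);
            }
            return source.charAt(from + index);
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            if (start < 0 || start > end || end > length()) {
                throw new IndexOutOfBoundsException("Invalid range");
            }
            int newFrom = from + start;
            int newTo = from + end;
            // The nested view has fixed bounds relative to the source.
            return new View(source, newFrom, () -> newTo);
        }

        @Override
        public String toString() {
            int len = length();
            StringBuilder sb = new StringBuilder(len);
            for (int i = 0; i < len; i++) {
                sb.append(charAt(i));
            }
            return sb.toString();
        }
    }

    /** Returns a view of everything in source from index start onward. */
    static CharSequence tail(StringBuilder source, int start) {
        return new View(source, start, source::length);
    }

    public static void main(String[] args) {
        StringBuilder source = new StringBuilder("prefix:");
        CharSequence toEnd = tail(source, 7);
        System.out.println(toEnd.length()); // prints 0
        source.append("value");
        System.out.println(toEnd);          // prints "value"
    }
}
```

As with the proposal, the view keeps the whole source reachable for as long as the view itself is referenced.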
--
Louis Wasserman