<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p>I would like to go with the webrev.2 version, with asserts
      removed.  All the reviewers have said they are OK with that
      version, and it allows more complete testing than the minimal
      version.</p>
    <p>dl<br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 3/23/17 12:03 PM,
      <a class="moz-txt-link-abbreviated" href="mailto:dean.long@oracle.com">dean.long@oracle.com</a> wrote:<br>
    </div>
    <blockquote
      cite="mid:7b319f81-2a48-1533-efe1-6fd86a3d7244@oracle.com"
      type="cite">
      <meta content="text/html; charset=windows-1252"
        http-equiv="Content-Type">
      <p>On 3/23/17 11:25 AM, <a moz-do-not-send="true"
          class="moz-txt-link-abbreviated"
          href="mailto:dean.long@oracle.com">dean.long@oracle.com</a>
        wrote:<br>
      </p>
      <blockquote
        cite="mid:7ade4c8d-891c-618b-0fa3-4d641da236ba@oracle.com"
        type="cite">On 3/22/17 1:49 PM, Vladimir Ivanov wrote: <br>
        <br>
        <blockquote type="cite">
          <blockquote type="cite">Also, it looks like the changes I made
            to ASB.appendChars(char[] s, int <br>
            off, int end) are not needed. <br>
          </blockquote>
          <br>
          Agree. <br>
          <br>
          <blockquote type="cite">
            <blockquote type="cite">Vladimir, don't you need to replace
              checkIndex with checkOffset in <br>
              indexOf and lastIndexOf, so that we allow count == length?
              <br>
            </blockquote>
          </blockquote>
          <br>
          Yes, my bad. Good catch. Updated webrev in place. <br>
          <br>
          FTR I haven't done any extensive testing of the minimized fix.
          <br>
          <br>
          If we agree to proceed with it, the regression test should be
          updated as well. I think the viable solution would be to
          construct broken SBs (using reflection) and invoke affected
          methods on them. <br>
          <br>
        </blockquote>
        <br>
        We can construct broken SBs using the Helper class that gets
        patched into java.lang.  I'll work on that. <br>
        <br>
      </blockquote>
      <br>
      Nevermind.  I forgot that some problems can only happen when the
      SB is changed half-way through the method.  For example,<br>
      in append(), we can't force an overflow unless we change the SB
      after ensureCapacityInternal() is called.  I could do something
      like:<br>
      <br>
      <meta http-equiv="content-type" content="text/html;
        charset=windows-1252">
      <pre> 760     public AbstractStringBuilder append(int i) {
<span class="new"> 761         int count = this.count;</span>
 762         int spaceNeeded = count + Integer.stringSize(i);
 763         ensureCapacityInternal(spaceNeeded);
 764         if (isLatin1()) {
 >>>>>>          Helper.fuzzValue(this);
 765             Integer.getChars(i, spaceNeeded, value);
 766         } else {
 767             byte[] val = this.value;
 >>>>>>          Helper.fuzzValue(this);
 <span class="changed">768             checkBoundsBeginEnd(count, spaceNeeded, val.length >> 1);</span>
 769             Integer.getCharsUTF16(i, spaceNeeded, val);
 770         }
<span class="changed"> 771         this.count = spaceNeeded;</span>
 772         return this;
 773     }</pre>
      <br>
      where the default Helper.fuzzValue() is an empty method, but the
      test would patch in its own version of Helper that changes the ASB
      field values.  I like this less than refactoring the checks to
      StringUTF16.<br>
      <br>
      dl<br>
      <br>
      <meta http-equiv="content-type" content="text/html;
        charset=windows-1252">
      <blockquote
        cite="mid:7ade4c8d-891c-618b-0fa3-4d641da236ba@oracle.com"
        type="cite">dl <br>
        <br>
        <blockquote type="cite">Best regards, <br>
          Vladimir Ivanov <br>
          <br>
          <blockquote type="cite">
            <blockquote type="cite">On 3/22/17 8:35 AM, Vladimir Ivanov
              wrote: <br>
              <blockquote type="cite">
                <blockquote type="cite">
                  <blockquote type="cite">
                    <blockquote type="cite">So are we convinced that the
                      proposed changes will never lead to a <br>
                      crash due to a missing or incorrect bounds check,
                      due to a racy <br>
                      use of <br>
                      an unsynchronized ASB instance e.g. StringBuilder?
                      <br>
                    </blockquote>
                    <br>
                    If only we had a static analysis tool that could
                    tell us if the <br>
                    code is <br>
                    safe.  Because we don't, in my initial changeset, we
                    always take a <br>
                    snapshot of the ASB fields by passing those field
                    values to <br>
                    StringUTF16 <br>
                    before doing checks on them.  And I wrote a test to
                    make sure that <br>
                    those <br>
                    StringUTF16 interfaces are catching all the
                    underflows and overflows I <br>
                    could imagine, and I added verification code to
                    detect when a check <br>
                    was <br>
                    missed. <br>
                    <br>
                    However, all the reviewers have requested to
                    minimize the amount of <br>
                    changes.  In Vladimir's version, if there is a
                    missing check <br>
                    somewhere, <br>
                    then yes it could lead to a crash. <br>
                  </blockquote>
                </blockquote>
                <br>
                I'd like to point out that asserts and verification code
                are disabled <br>
                by default. They are invaluable during problem
                diagnosis, but don't <br>
                help at all from defence-in-depth perspective. <br>
                <br>
                But I agree that it's easier to reason about and test
                the initial <br>
                version of the fix. <br>
                <br>
                <blockquote type="cite">I wonder if the reviewers have
                  fully realized the potential impact <br>
                  here? <br>
                  This has exposed a flaw in the way intrinsics are used
                  from core <br>
                  classes. <br>
                </blockquote>
                <br>
                FTR here are the checks I omitted in the minimized
                version (modulo <br>
                separation of indexOf/lastIndexOf for
                trusted/non-trusted callers): <br>
                <br>
                <a moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Evlivanov/dlong/8158168/redundant_checks/">http://cr.openjdk.java.net/~vlivanov/dlong/8158168/redundant_checks/</a>
                <br>
                <br>
                Other than that, the difference is mainly about undoing
                refactorings <br>
                and removing verification logic (asserts + checks in the
                JVM). <br>
                <br>
                There are still unsafe accesses which are considered
                safe in both <br>
                versions (see StringUTF16.Trusted usages in the initial
                version [1]). <br>
                <br>
                We used to provide safe wrappers for unsafe intrinsics
                which makes it <br>
                much easier to reason about code correctness. I'd like
                to see compact <br>
                string code refactored that way and IMO the initial
                version by Dean <br>
                is a big step in the right direction. <br>
                <br>
                I still prefer to see a point fix in 9 and major
                refactoring <br>
                happening in 10, but I'll leave the decision on how to
                proceed with <br>
                the fix to core-libs folks. After finishing the exercise
                minimizing <br>
                the fix, I'm much more comfortable with the initial fix
                [1] (though <br>
                there are changes I consider excessive). <br>
                <br>
                Best regards, <br>
                Vladimir Ivanov <br>
                <br>
                [1] <a moz-do-not-send="true"
                  class="moz-txt-link-freetext"
                  href="http://cr.openjdk.java.net/%7Edlong/8158168/webrev.0">http://cr.openjdk.java.net/~dlong/8158168/webrev.0</a>
                <br>
                <br>
                <blockquote type="cite">
                  <blockquote type="cite">
                    <blockquote type="cite">
                      <blockquote type="cite">
                        <blockquote type="cite">Some clarifications: <br>
                          <br>
                          ============ <br>
src/java.base/share/classes/java/lang/String.java: <br>
                          <br>
                          The bounds check is needed only in
                          String.nonSyncContentEquals <br>
                          when it <br>
                          extracts info from AbstractStringBuilder. I
                          don't see how out of <br>
                          bounds access can happen in
                          String.contentEquals: <br>
                                   if (n != length()) { <br>
                                       return false; <br>
                                   } <br>
                          ... <br>
                                       for (int i = 0; i < n; i++) {
                          <br>
                                           if (StringUTF16.getChar(val,
                          i) != cs.charAt(i)) { <br>
                          <br>
                        </blockquote>
                        <br>
                        OK. <br>
                        <br>
                        <blockquote type="cite">============ <br>
src/java.base/share/classes/java/lang/StringConcatHelper.java: <br>
                          <br>
                          I think bounds checks in
                          StringConcatHelper.prepend() are skipped <br>
                          intentionally, since
                          java.lang.invoke.StringConcatFactory <br>
                          constructs <br>
                          method handle chains which already contain
                          bounds checks: array <br>
                          length <br>
                          is precomputed based on argument values and
                          all accesses are <br>
                          guaranteed to be in bounds. <br>
                          <br>
                        </blockquote>
                        <br>
                        This is calling the trusted version of
                        getChars() with no bounds <br>
                        checks.  It was a little more obvious when I had
                        the Trusted inner <br>
                        class. <br>
                        <br>
                        <blockquote type="cite">============ <br>
src/java.base/share/classes/java/lang/StringUTF16.java: <br>
                          <br>
                          +    static void putChar(byte[] val, int
                          index, int c) { <br>
                          +        assert index >= 0 && index
                          < length(val) : "Trusted caller <br>
                          missed bounds check"; <br>
                          <br>
                          Unfortunately, asserts can affect inlining
                          decisions (since they <br>
                          increase bytecode size). In order to minimize
                          possible performance <br>
                          impact, I suggest to remove them from the fix
                          targeting 9. <br>
                          <br>
                        </blockquote>
                        <br>
                        Sure. <br>
                        <br>
                        <blockquote type="cite">============ <br>
                               private static int
                          indexOfSupplementary(byte[] value, int <br>
                          ch, int <br>
                          fromIndex, int max) { <br>
                                   if (Character.isValidCodePoint(ch)) {
                          <br>
                                       final char hi =
                          Character.highSurrogate(ch); <br>
                                       final char lo =
                          Character.lowSurrogate(ch); <br>
                          +            checkBoundsBeginEnd(fromIndex,
                          max, value); <br>
                          <br>
                          The check is redundant here. fromIndex &
                          max are always inbounds by <br>
                          construction: <br>
                          <br>
                              public static int indexOf(byte[] value,
                          int ch, int <br>
                          fromIndex) { <br>
                                  int max = value.length >> 1; <br>
                                  if (fromIndex < 0) { <br>
                                      fromIndex = 0; <br>
                                  } else if (fromIndex >= max) { <br>
                                      // Note: fromIndex might be near
                          -1>>>1. <br>
                                      return -1; <br>
                                  } <br>
                          ... <br>
                                      return indexOfSupplementary(value,
                          ch, fromIndex, max); <br>
                          <br>
                        </blockquote>
                        <br>
                        OK. <br>
                        <br>
                        <blockquote type="cite">============ <br>
                          I moved bounds checks from
                          StringUTF16.lastIndexOf/indexOf to <br>
                          ABS.indexOf/lastIndexOf. I think it's enough
                          to do range check on <br>
                          ABS.value & ABS.count. After that, all
                          accesses should be <br>
                          inbounds by <br>
                          construction (in String.indexOf/lastIndexOf):
                          <br>
                          <br>
jdk/src/java.base/share/classes/java/lang/StringUTF16.java: <br>
                              static int lastIndexOf(byte[] src, byte
                          srcCoder, int srcCount, <br>
                                                     String tgtStr, int
                          fromIndex) { <br>
                          <br>
                                  int rightIndex = srcCount - tgtCount;
                          <br>
                                  if (fromIndex > rightIndex) { <br>
                                      fromIndex = rightIndex; <br>
                                  } <br>
                                  if (fromIndex < 0) { <br>
                                      return -1; <br>
                                  } <br>
                          <br>
jdk/src/java.base/share/classes/java/lang/StringUTF16.java: <br>
                              public static int lastIndexOf(byte[] src,
                          int srcCount, <br>
                                                            byte[] tgt,
                          int tgtCount, int <br>
                          fromIndex) { <br>
                                  int min = tgtCount - 1; <br>
                                  int i = min + fromIndex; <br>
                                  int strLastIndex = tgtCount - 1; <br>
                                  char strLastChar = getChar(tgt,
                          strLastIndex); <br>
                          <br>
                              startSearchForLastChar: <br>
                                  while (true) { <br>
                                      while (i >= min &&
                          getChar(src, i) != strLastChar) { <br>
                          <br>
                          There are 2 places: <br>
                            * getChar(tgt, strLastIndex) =>
                          getChar(tgt, tgtCount-1) - <br>
                          inbound <br>
                          <br>
                            * getChar(src, i); i in [ min; min+fromIndex
                          ] <br>
                              min = tgtCount - 1 <br>
                              rightIndex = srcCount - tgtCount <br>
                              fromIndex <= rightIndex <br>
                          <br>
                                     0 <= min + fromIndex <= min +
                          rightIndex == (tgtCount <br>
                          - 1) <br>
                          + (srcCount - tgtCount) == srcCount - 1 <br>
                          <br>
                              Hence, should be covered by the check on
                          count & value: <br>
                                public int lastIndexOf(String str, int
                          fromIndex) { <br>
                          +         byte[] value = this.value; <br>
                          +         int count = this.count; <br>
                          +         byte coder = this.coder; <br>
                          +         checkIndex(count, value.length
                          >> coder); <br>
                                    return String.lastIndexOf(value,
                          coder, count, str, <br>
                          fromIndex); <br>
                                } <br>
                          <br>
                        </blockquote>
                        <br>
                        OK, I will go with your version if it's OK with
                        Sherman. <br>
                        <br>
                        dl <br>
                        <br>
                        <blockquote type="cite">Best regards, <br>
                          Vladimir Ivanov <br>
                          <br>
                          <blockquote type="cite">On 3/17/17 5:58 AM,
                            Vladimir Ivanov wrote: <br>
                            <blockquote type="cite"> <br>
                              <blockquote type="cite">
                                <blockquote type="cite">I have the same
                                  concern. Can we fix the immediate
                                  problem in <br>
                                  9 and <br>
                                  integrate verification logic in 10? <br>
                                  <br>
                                </blockquote>
                                <br>
                                OK, Tobias is suggesting having
                                verification logic only <br>
                                inside the <br>
                                intrinsics.  Are you suggesting removing
                                that as well? <br>
                              </blockquote>
                              <br>
                              Yes and put them back in 10. <br>
                              <br>
                              <blockquote type="cite">I'm OK with
                                removing all the verification, but that
                                won't reduce <br>
                                the <br>
                                library changes much.  I could undo the
                                renaming to <br>
                                Trusted.getChar, but <br>
                                we would still have the bounds checks
                                moved into StringUTF16. <br>
                              </blockquote>
                              <br>
                              I suggest to go with a point fix for 9:
                              just add missing range <br>
                              checks. <br>
                            </blockquote>
                            <br>
                          </blockquote>
                        </blockquote>
                        <br>
                      </blockquote>
                    </blockquote>
                    <br>
                  </blockquote>
                </blockquote>
              </blockquote>
              <br>
            </blockquote>
            <br>
          </blockquote>
        </blockquote>
        <br>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>