[9] RFR(L) 8158168: SIGSEGV: CollectedHeap::fill_with_objects(HeapWord*, unsigned long, bool)+0xa8

Fri Mar 31 06:49:05 UTC 2017

On 30.03.2017 20:24, dean.long at oracle.com wrote:
> I would like to go with the webrev.2 version, with asserts removed.  All the reviewers have said they are OK with that version, and it allows more complete testing than the minimal version.

Okay, I'm fine with that!

Best regards,
Tobias

> On 3/23/17 12:03 PM, dean.long at oracle.com wrote:
>>
>> On 3/23/17 11:25 AM, dean.long at oracle.com wrote:
>>
>>> On 3/22/17 1:49 PM, Vladimir Ivanov wrote:
>>>
>>>>> Also, it looks like the changes I made to ASB.appendChars(char[] s, int
>>>>> off, int end) are not needed.
>>>>
>>>> Agree.
>>>>
>>>>>> Vladimir, don't you need to replace checkIndex with checkOffset in
>>>>>> indexOf and lastIndexOf, so that we allow count == length?
>>>>
>>>> Yes, my bad. Good catch. Updated webrev in place.
>>>>
>>>> FTR I haven't done any extensive testing of the minimized fix.
>>>>
>>>> If we agree to proceed with it, the regression test should be updated as well. I think the viable solution would be to construct broken SBs (using reflection) and invoke affected methods on them.
>>>>
>>>
>>> We can construct broken SBs using the Helper class that gets patched into java.lang.  I'll work on that.
>>>
>>
>> Nevermind.  I forgot that some problems can only happen when the SB is changed half-way through the method.  For example,
>> in append(), we can't force an overflow unless we change the SB after ensureCapacityInternal() is called.  I could do something like:
>>
>>  760     public AbstractStringBuilder append(int i) {
>> 761 int count = this.count;
>>  762         int spaceNeeded = count + Integer.stringSize(i);
>>  763         ensureCapacityInternal(spaceNeeded);
>>  764         if (isLatin1()) {
>>  >>>>>>          Helper.fuzzValue(this);
>>  765             Integer.getChars(i, spaceNeeded, value);
>>  766         } else {
>>  767             byte[] val = this.value;
>>  >>>>>>          Helper.fuzzValue(this);
>>  768 checkBoundsBeginEnd(count, spaceNeeded, val.length >> 1);
>>  769             Integer.getCharsUTF16(i, spaceNeeded, val);
>>  770         }
>> 771 this.count = spaceNeeded;
>>  772         return this;
>>  773     }
>>
>> where the default Helper.fuzzValue() is an empty method, but the test would patch in its own version of Helper that changes the ASB field values.  I like this less than refactoring the checks to StringUTF16.
>>
>> dl
>>
>>> dl
>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>>>> On 3/22/17 8:35 AM, Vladimir Ivanov wrote:
>>>>>>>>>> So are we convinced that the proposed changes will never lead to a
>>>>>>>>>> crash due to a missing or incorrect bounds check, due to a racy
>>>>>>>>>> use of
>>>>>>>>>> an unsynchronized ASB instance e.g. StringBuilder?
>>>>>>>>>
>>>>>>>>> If only we had a static analysis tool that could tell us if the
>>>>>>>>> code is
>>>>>>>>> safe.  Because we don't, in my initial changeset, we always take a
>>>>>>>>> snapshot of the ASB fields by passing those field values to
>>>>>>>>> StringUTF16
>>>>>>>>> before doing checks on them.  And I wrote a test to make sure that
>>>>>>>>> those
>>>>>>>>> StringUTF16 interfaces are catching all the underflows and overflows I
>>>>>>>>> could imagine, and I added verification code to detect when a check
>>>>>>>>> was
>>>>>>>>> missed.
>>>>>>>>>
>>>>>>>>> However, all the reviewers have requested to minimize the amount of
>>>>>>>>> changes.  In Vladimir's version, if there is a missing check
>>>>>>>>> somewhere,
>>>>>>>>> then yes it could lead to a crash.
>>>>>>>
>>>>>>> I'd like to point out that asserts and verification code are disabled
>>>>>>> by default. They are invaluable during problem diagnosis, but don't
>>>>>>> help at all from defence-in-depth perspective.
>>>>>>>
>>>>>>> But I agree that it's easier to reason about and test the initial
>>>>>>> version of the fix.
>>>>>>>
>>>>>>>> I wonder if the reviewers have fully realized the potential impact
>>>>>>>> here?
>>>>>>>> This has exposed a flaw in the way intrinsics are used from core
>>>>>>>> classes.
>>>>>>>
>>>>>>> FTR here are the checks I omitted in the minimized version (modulo
>>>>>>> separation of indexOf/lastIndexOf for trusted/non-trusted callers):
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~vlivanov/dlong/8158168/redundant_checks/
>>>>>>>
>>>>>>> Other than that, the difference is mainly about undoing refactorings
>>>>>>> and removing verification logic (asserts + checks in the JVM).
>>>>>>>
>>>>>>> There are still unsafe accesses which are considered safe in both
>>>>>>> versions (see StringUTF16.Trusted usages in the initial version [1]).
>>>>>>>
>>>>>>> We used to provide safe wrappers for unsafe intrinsics which makes it
>>>>>>> much easier to reason about code correctness. I'd like to see compact
>>>>>>> string code refactored that way and IMO the initial version by Dean
>>>>>>> is a big step in the right direction.
>>>>>>>
>>>>>>> I still prefer to see a point fix in 9 and major refactoring
>>>>>>> happening in 10, but I'll leave the decision on how to proceed with
>>>>>>> the fix to core-libs folks. After finishing the exercise minimizing
>>>>>>> the fix, I'm much more comfortable with the initial fix [1] (though
>>>>>>> there are changes I consider excessive).
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Vladimir Ivanov
>>>>>>>
>>>>>>> [1] http://cr.openjdk.java.net/~dlong/8158168/webrev.0
>>>>>>>
>>>>>>>>>>>> Some clarifications:
>>>>>>>>>>>>
>>>>>>>>>>>> ============
>>>>>>>>>>>> src/java.base/share/classes/java/lang/String.java:
>>>>>>>>>>>>
>>>>>>>>>>>> The bounds check is needed only in String.nonSyncContentEquals
>>>>>>>>>>>> when it
>>>>>>>>>>>> extracts info from AbstractStringBuilder. I don't see how out of
>>>>>>>>>>>> bounds access can happen in String.contentEquals:
>>>>>>>>>>>>          if (n != length()) {
>>>>>>>>>>>>              return false;
>>>>>>>>>>>>          }
>>>>>>>>>>>> ...
>>>>>>>>>>>>              for (int i = 0; i < n; i++) {
>>>>>>>>>>>>                  if (StringUTF16.getChar(val, i) != cs.charAt(i)) {
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> OK.
>>>>>>>>>>>
>>>>>>>>>>>> ============
>>>>>>>>>>>> src/java.base/share/classes/java/lang/StringConcatHelper.java:
>>>>>>>>>>>>
>>>>>>>>>>>> I think bounds checks in StringConcatHelper.prepend() are skipped
>>>>>>>>>>>> intentionally, since java.lang.invoke.StringConcatFactory
>>>>>>>>>>>> constructs
>>>>>>>>>>>> method handle chains which already contain bounds checks: array
>>>>>>>>>>>> length
>>>>>>>>>>>> is precomputed based on argument values and all accesses are
>>>>>>>>>>>> guaranteed to be in bounds.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This is calling the trusted version of getChars() with no bounds
>>>>>>>>>>> checks.  It was a little more obvious when I had the Trusted inner
>>>>>>>>>>> class.
>>>>>>>>>>>
>>>>>>>>>>>> ============
>>>>>>>>>>>> src/java.base/share/classes/java/lang/StringUTF16.java:
>>>>>>>>>>>>
>>>>>>>>>>>> +    static void putChar(byte[] val, int index, int c) {
>>>>>>>>>>>> +        assert index >= 0 && index < length(val) : "Trusted caller
>>>>>>>>>>>> missed bounds check";
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately, asserts can affect inlining decisions (since they
>>>>>>>>>>>> increase bytecode size). In order to minimize possible performance
>>>>>>>>>>>> impact, I suggest to remove them from the fix targeting 9.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Sure.
>>>>>>>>>>>
>>>>>>>>>>>> ============
>>>>>>>>>>>>      private static int indexOfSupplementary(byte[] value, int
>>>>>>>>>>>> ch, int
>>>>>>>>>>>> fromIndex, int max) {
>>>>>>>>>>>>          if (Character.isValidCodePoint(ch)) {
>>>>>>>>>>>>              final char hi = Character.highSurrogate(ch);
>>>>>>>>>>>>              final char lo = Character.lowSurrogate(ch);
>>>>>>>>>>>> +            checkBoundsBeginEnd(fromIndex, max, value);
>>>>>>>>>>>>
>>>>>>>>>>>> The check is redundant here. fromIndex & max are always inbounds by
>>>>>>>>>>>> construction:
>>>>>>>>>>>>
>>>>>>>>>>>>     public static int indexOf(byte[] value, int ch, int
>>>>>>>>>>>> fromIndex) {
>>>>>>>>>>>>         int max = value.length >> 1;
>>>>>>>>>>>>         if (fromIndex < 0) {
>>>>>>>>>>>>             fromIndex = 0;
>>>>>>>>>>>>         } else if (fromIndex >= max) {
>>>>>>>>>>>>             // Note: fromIndex might be near -1>>>1.
>>>>>>>>>>>>             return -1;
>>>>>>>>>>>>         }
>>>>>>>>>>>> ...
>>>>>>>>>>>>             return indexOfSupplementary(value, ch, fromIndex, max);
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> OK.
>>>>>>>>>>>
>>>>>>>>>>>> ============
>>>>>>>>>>>> I moved bounds checks from StringUTF16.lastIndexOf/indexOf to
>>>>>>>>>>>> ABS.indexOf/lastIndexOf. I think it's enough to do range check on
>>>>>>>>>>>> ABS.value & ABS.count. After that, all accesses should be
>>>>>>>>>>>> inbounds by
>>>>>>>>>>>> construction (in String.indexOf/lastIndexOf):
>>>>>>>>>>>>
>>>>>>>>>>>> jdk/src/java.base/share/classes/java/lang/StringUTF16.java:
>>>>>>>>>>>>     static int lastIndexOf(byte[] src, byte srcCoder, int srcCount,
>>>>>>>>>>>>                            String tgtStr, int fromIndex) {
>>>>>>>>>>>>
>>>>>>>>>>>>         int rightIndex = srcCount - tgtCount;
>>>>>>>>>>>>         if (fromIndex > rightIndex) {
>>>>>>>>>>>>             fromIndex = rightIndex;
>>>>>>>>>>>>         }
>>>>>>>>>>>>         if (fromIndex < 0) {
>>>>>>>>>>>>             return -1;
>>>>>>>>>>>>         }
>>>>>>>>>>>>
>>>>>>>>>>>> jdk/src/java.base/share/classes/java/lang/StringUTF16.java:
>>>>>>>>>>>>     public static int lastIndexOf(byte[] src, int srcCount,
>>>>>>>>>>>>                                   byte[] tgt, int tgtCount, int
>>>>>>>>>>>> fromIndex) {
>>>>>>>>>>>>         int min = tgtCount - 1;
>>>>>>>>>>>>         int i = min + fromIndex;
>>>>>>>>>>>>         int strLastIndex = tgtCount - 1;
>>>>>>>>>>>>         char strLastChar = getChar(tgt, strLastIndex);
>>>>>>>>>>>>
>>>>>>>>>>>>     startSearchForLastChar:
>>>>>>>>>>>>         while (true) {
>>>>>>>>>>>>             while (i >= min && getChar(src, i) != strLastChar) {
>>>>>>>>>>>>
>>>>>>>>>>>> There are 2 places:
>>>>>>>>>>>>   * getChar(tgt, strLastIndex) => getChar(tgt, tgtCount-1) -
>>>>>>>>>>>> inbound
>>>>>>>>>>>>
>>>>>>>>>>>>   * getChar(src, i); i in [ min; min+fromIndex ]
>>>>>>>>>>>>     min = tgtCount - 1
>>>>>>>>>>>>     rightIndex = srcCount - tgtCount
>>>>>>>>>>>>     fromIndex <= rightIndex
>>>>>>>>>>>>
>>>>>>>>>>>>            0 <= min + fromIndex <= min + rightIndex == (tgtCount
>>>>>>>>>>>> - 1)
>>>>>>>>>>>> + (srcCount - tgtCount) == srcCount - 1
>>>>>>>>>>>>
>>>>>>>>>>>>     Hence, should be covered by the check on count & value:
>>>>>>>>>>>>       public int lastIndexOf(String str, int fromIndex) {
>>>>>>>>>>>> +         byte[] value = this.value;
>>>>>>>>>>>> +         int count = this.count;
>>>>>>>>>>>> +         byte coder = this.coder;
>>>>>>>>>>>> +         checkIndex(count, value.length >> coder);
>>>>>>>>>>>>           return String.lastIndexOf(value, coder, count, str,
>>>>>>>>>>>> fromIndex);
>>>>>>>>>>>>       }
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> OK, I will go with your version if it's OK with Sherman.
>>>>>>>>>>>
>>>>>>>>>>> dl
>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/17/17 5:58 AM, Vladimir Ivanov wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have the same concern. Can we fix the immediate problem in
>>>>>>>>>>>>>>>> 9 and
>>>>>>>>>>>>>>>> integrate verification logic in 10?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> OK, Tobias is suggesting having verification logic only
>>>>>>>>>>>>>>> inside the
>>>>>>>>>>>>>>> intrinsics.  Are you suggesting removing that as well?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes and put them back in 10.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm OK with removing all the verification, but that won't reduce
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> library changes much.  I could undo the renaming to
>>>>>>>>>>>>>>> Trusted.getChar, but
>>>>>>>>>>>>>>> we would still have the bounds checks moved into StringUTF16.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I suggest to go with a point fix for 9: just add missing range
>>>>>>>>>>>>>> checks.
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>
>>>
>>
>