[9] RFR(S): Crash with assert: symbol conversion failure in java_lang_String::create_from_symbol()

Tobias Hartmann tobias.hartmann at oracle.com
Tue Sep 20 06:20:59 UTC 2016


Hi David,

On 20.09.2016 02:02, David Holmes wrote:
> Let me get this straight.
> 
> The compact strings push occurred on Nov 3, 2015, some ten months ago! One of the assertions was commented out in that push because otherwise some tests failed. Someone made the decision that the tests were right and the assert was wrong - in which case it should just have been deleted from the code.

Right, I referenced the failing (closed) test in my latest comment in the bug. It's not clear to me whether the test or the assert is "correct" and unfortunately I was not able to reproduce this old failure. As suggested, I would like to file a new bug to investigate re-enabling this assert.

> Now 10 months later the other assert is suddenly failing and the claim is that it is the assert that is wrong! Why is it failing now - is this is a new test? If so we have to establish whether it is the test code or the VM code that is making the wrong assumptions. If the test is deemed valid then it is now passing invalid UTF-8 through a path never before tested with invalid UTF-8, so we need to know that the code will in fact handle that invalid input safely. Only then should the assert be removed, and that is only if it is determined that the test is valid.

As described in the bug, the JNI test is broken and reads the UTF-8 strings from after the end of a char array. This garbage may or may not be invalid UTF-8. We were just unlucky to hit this now. So in this case it's the test that is wrong and not the assert.

> In this particular case a bad string is being passed to jni_FindClass. JNI doesn't do validation of the string (bar minimal sanity checks) but should the VM be being more vigilant? It may be that an invalidly formed UTF-8 string can travel safely through the VM, and in this case (as Tobias indicated) just result in the class not being found. Or it may be there are places that make assumptions about the validity of such a string.

Yes and to be on the safe side and to move forward, I would like to close this bug as "not an issue" and file a new bug to investigate re-enabling the create_from_str() assert.

Are you okay with that? 

Thanks,
Tobias

> 
> David
> -----
> 
> On 19/09/2016 7:21 PM, Tobias Hartmann wrote:
>> Hi David,
>>
>> On 19.09.2016 10:30, David Holmes wrote:
>>> On 19/09/2016 6:09 PM, Tobias Hartmann wrote:
>>>> Coleen, David, thanks for the reviews and sorry for the delay!
>>>>
>>>> I agree that we should validate the Strings we get through JNI, but the code in create_from_str() and create_from_symbol() was originally added to check the result of String compression with the Compact Strings feature and not to check for valid UTF-8.
>>>>
>>>> I think UTF-8 validity should be checked earlier (before or during Symbol creation) and this affects other JNI methods that take C-Strings as well (jni_NewString, jni_NewStringUTF, jni_DefineClass, ...). This is a general problem with JNI not validating input Strings.
>>>>
>>>> Are you fine with pushing the proposed fix (webrev.00) and open a follow up bug to fix JNI (and potentially re-enable the Compact Strings asserts because their existence is still justified)?
>>>
>>> As I wrote in the bug report:
>>>
>>> "I don't think I agree with the removed checks in either case. JNI doesn't do argument checking and so doesn't detect the invalid UTF-8 string. But the VM expects to work with valid strings. It seems quite reasonable to me that the VM should validate the UTF-8 and report some kind of failure to make these invalid inputs visible. Otherwise the bad JNI codes goes undetected. Maybe an assertion is too strong, perhaps we should throw InternalError? "
>>
>> Yes, I've seen that and, as I said, I agree that we should validate the Strings.
>>
>>> If the VM does not detect this and respond to it exactly what happens with the bad UTF-8 string?
>>
>> In the FindClass case, we just return NULL. I don't know if there are potential problems with invalid UTF-8 strings at other places in the code.
>>
>>> JNI doesn't do input validation - it is a "feature" of JNI so not to penalize correct code.
>>
>> Right, but this check only affects debug builds.
>>
>>> If the test that triggered this is sending in bad UTF-8 because it expects JNI to do validation then it is an invalid test. I think the asserts serve their purpose exactly - to show where invalid inputs came from. If this was bad user code then it would be up to them to fix it. If it is a bad test then the test should be fixed.
>>
>> Sure, the test is broken and will be fixed. So you're suggesting to leave the check in create_from_symbol() as it is and re-enable the check in create_from_str()?
>>
>> I'm fine with that but as I remember, re-enabling the check in create_from_str() causes other test failures.
>>
>> Best regards,
>> Tobias
>>
>>> Sorry.
>>>
>>> David
>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> On 12.09.2016 02:47, David Holmes wrote:
>>>>> On 10/09/2016 6:55 AM, Coleen Phillimore wrote:
>>>>>> This change is fine because it matches the commented out assert in
>>>>>> create_from_str().   We should probably figure out what it would take to
>>>>>> check the characters coming in from JNI and decide whether we should do
>>>>>> this.  If not, it doesn't make sense to have commented out asserts.
>>>>>> But this is okay for jdk9.
>>>>>
>>>>> Grumble, grumble ... both are bad. If the VM doesn't validate this bad UTF-8 then where does it go? And how does the generator of the bad UTF-8 get informed? An assert may be too drastic but can we throw an exception (InternalError?) ?
>>>>>
>>>>> David
>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>> On 9/9/16 8:42 AM, Tobias Hartmann wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> please review the following patch:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8164561
>>>>>>> http://cr.openjdk.java.net/~thartmann/8164561/webrev.00/
>>>>>>>
>>>>>>> The verification code in java_lang_String::create_from_symbol() that
>>>>>>> was added by Compact Strings fails because the input symbol does not
>>>>>>> contain valid UTF8. The problem is that a JCK JNI test passes an
>>>>>>> invalid UTF8 string as class name to the JNI method "FindClass". In
>>>>>>> fact, the string contains garbage from reading past array boundaries
>>>>>>> because of a bug in the test [1]. The JNI spec [2] states that 'name'
>>>>>>> should be "a fully-qualified class name (that is, a package name,
>>>>>>> delimited by “/”, followed by the class name). If the name begins with
>>>>>>> “[“ (the array signature character), it returns an array class. The
>>>>>>> string is encoded in modified UTF-8".
>>>>>>>
>>>>>>> I nevertheless think that we should not crash in the case of an
>>>>>>> invalid UTF8 string and therefore disabled the verification code with
>>>>>>> a comment. We did the same for java_lang_String::create_from_str() [3].
>>>>>>>
>>>>>>> Tested with failing JCK test and JPRT (running).
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Tobias
>>>>>>>
>>>>>>> [1] https://bugs.openjdk.java.net/browse/JCK-7307244
>>>>>>> [2]
>>>>>>> https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/functions.html#FindClass
>>>>>>>
>>>>>>> [3]
>>>>>>> http://hg.openjdk.java.net/jdk9/hs/hotspot/file/d060826d0911/src/share/vm/classfile/javaClasses.cpp#l274
>>>>>>>
>>>>>>


More information about the hotspot-dev mailing list