RFR for JDK-8022879 TEST_BUG: sun/nio/cs/MalformedSurrogates.java fails intermittently
Martin Buchholz
martinrb at google.com
Wed Dec 18 08:47:05 PST 2013
The latest webrev looks great to me!
On Tue, Dec 17, 2013 at 10:34 PM, Eric Wang <yiming.wang at oracle.com> wrote:
> Hi Martin,
>
> I have updated the test to add another 2 surrogates, solitary high and
> solitary low, so 4 possibilities are covered.
> The verbose code has been cleaned based on your suggestion, Can you please
> review again?
> http://cr.openjdk.java.net/~ewang/JDK-8022879/webrev.01/
>
> Thanks,
> Eric
>
> On 2013/12/18 3:03, Martin Buchholz wrote:
>
> Thanks, Eric.
>
> This looks good enough to check in. Optionally:
>
> Regarding my comment on what surrogate to test, I would tend to check 4
> possibilities: solitary high surrogate, solitary low, high + low, low + high
>
>
> The code below is a bit more verbose than necessary. We can remove f by
> doing
>
> try {
> ...
> throw new ..("should throw ...");
> } catch (MalformedInputException | UnmappableCharacterException success) {
> }
>
> en.reset();
>
> 69 boolean f = false; 70 try { 71 en.encode(CharBuffer.wrap(surrogate)); 72 } catch (MalformedInputException | UnmappableCharacterException ex) { 73 f = true; 74 } finally { 75 en.reset(); 76 } 77 if (!f) { 78 throw new RuntimeException("testMalformedSurrogate failed with charset " + cs.name()); 79 }
>
>
>
>
>
> On Tue, Dec 17, 2013 at 1:18 AM, Eric Wang <yiming.wang at oracle.com> wrote:
>
>> Hi Martin,
>>
>> I have updated the test based on your comments and split the old logic
>> into different functions to test legal or illegal surrogate pairs. Can you
>> please help to review it?
>> http://cr.openjdk.java.net/~ewang/JDK-8022879/webrev.01/
>>
>> For your last comment "reversed surrogate", I'm not sure whether my
>> understanding is correct. Does that mean reverse legal surrogate e.g.
>> \uD800\uDC00 to \uDC00\uD800 which is kind of malformed surrogate.
>> I have covered this case in the test.
>>
>> Thanks,
>> Eric
>>
>> On 2013/12/14 8:15, Martin Buchholz wrote:
>>
>> I do like this approach. I think we can have some cleanup to make it a
>> little clearer what is being tested.
>>
>> Below, it makes no sense to check canEncode when you are creating a
>> decoder.
>>
>> 60 try { 61 de = charset.newDecoder(); 62 } catch (UnsupportedOperationException ex) { 63 if (canEncode) { 64 throw ex; 65 } 66 }
>>
>>
>> *Once you do this:*
>>
>> 68 if (!canEncode) { 69 continue; 70 }
>>
>> the loop should no longer test canEncode, but does, obscuring the logic.
>>
>> Use standard whitespace:
>>
>> 109 }finally {
>>
>> When encoding legal surrogate pair, should see UnmappableCharacterException,
>> not MalformedInputException.
>> Reverse is true with illegal surrogate pair.
>>
>> should probably tests isolated, in addition to reversed, surrogates.
>>
>> Compare my old test FindCanEncodeBugs.
>>
>>
>>
>> On Thu, Dec 12, 2013 at 4:51 AM, Eric Wang <yiming.wang at oracle.com>wrote:
>>
>>> Hi Martin & All,
>>>
>>> Please review the changes for MalformedSurrogates.
>>> http://cr.openjdk.java.net/~ewang/JDK-8022879/webrev.00/
>>>
>>> The logic of changes are
>>> 1. Iterate all charsets of system to check whether the charset supports
>>> encode.
>>> 2. If yes, check whether it supports surrogate.
>>> 3. If yes, check whether it rejects malformed surrogate and can
>>> en(de)code normal surrogate.
>>>
>>> Thanks,
>>> Eric
>>>
>>> On 2013/11/8 0:23, Martin Buchholz wrote:
>>>
>>> I still like my old idea of iterating over all charsets and checking
>>> their reasonableness properties.
>>>
>>> Probably all charsets bundled with the jdk should reject unpaired
>>> surrogates when encoding. Check it! Failure to do so might be considered
>>> a security bug.
>>>
>>> I would use CodingErrorAction.REPORT instead of REPLACE and examine
>>> that the resulting exception occurs and has reasonable detail.
>>>
>>> For properly paired surrogates, check that the charset can encode the
>>> resulting codepoint using canEncode, and if so, check that encoding
>>> succeeds.
>>>
>>>
>>> On Thu, Nov 7, 2013 at 7:22 AM, Eric Wang <yiming.wang at oracle.com>wrote:
>>>
>>>> Hi Everyone
>>>>
>>>> I am working on bug https://bugs.openjdk.java.net/browse/JDK-8022879.
>>>> The test sun/nio/cs/MalformedSurrogates.java<http://hg.openjdk.java.net/jdk8/tl/jdk/file/44fa6bf42846/test/sun/nio/cs/MalformedSurrogates.java>doesn't run if the system default encoding is UTF-8. But unfortunately,
>>>> UTF-8 is the default charset of most test machines, it means the test get
>>>> few chances to be executed.
>>>> Another defect is the test would failed if the default charset is
>>>> UTF-16 or UTF-32 as the test doesn't take the 2 charsets into consideration.
>>>>
>>>> The idea of fix is no matter what system charset it is, the test
>>>> should always be executed. Here thanks Martin's suggestion that instead of
>>>> checking byte size, we can use CharsetEncoder.canEncode() and CharsetEncoder.onMalformedInput(CodingErrorAction.REPLACE)
>>>> to check and replace malformed chars.
>>>>
>>>> So the test can be re-designed as below:
>>>>
>>>> 1. To use CharsetEncoder.canEncode() to check whether the string
>>>> includes malformed characters.
>>>> 2. If a string includes malformed characters e.g.
>>>> "abc\uD800\uDB00efgh", then set
>>>> CharsetEncoder.onMalformedInput(CodingErrorAction.REPLACE) to replace the
>>>> malformed characters to the replacement "?" when calling
>>>> CharsetEncoder.encode() method.
>>>> 3. Verified by decoding the encoded ByteBuffer to CharBuffer, check
>>>> whether it includes replacement "?" and compare it with old string, if not
>>>> equal, then test passed.
>>>> 4. If a sting doesn't include malformed characters e.g.
>>>> "abc\uD800\uDC00efgh", the CharsetEncoder.encode() converts it to
>>>> ByteBuffer which doesn't include replacement "?"
>>>> 5. Verified by decoding the encoded ByteBuffer to CharBuffer, confirm
>>>> that it doesn't include replacment "?" and compare it with old string, if
>>>> equal, then test passed.
>>>>
>>>> Please let me know if you have any comments or suggestions.
>>>>
>>>> Thanks,
>>>> Eric
>>>>
>>>
>>>
>>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20131218/ad2aaca9/attachment-0001.html
More information about the nio-dev
mailing list