RFR for JDK-8022879 TEST_BUG: sun/nio/cs/MalformedSurrogates.java fails intermittently

Thu Nov 7 08:23:20 PST 2013

I still like my old idea of iterating over all charsets and checking their
reasonableness properties.

Probably all charsets bundled with the jdk should reject unpaired
surrogates when encoding.  Check it!  Failure to do so might be considered
a security bug.
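
A minimal sketch of such a scan, assuming a lone high surrogate embedded in
ASCII text as the sample input. The class name, method names, and sample
string are hypothetical and are not taken from the existing test:

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

class UnpairedSurrogateScan {
    // Hypothetical sample input: a lone high surrogate between ASCII letters.
    static final String LONE_SURROGATE = "abc\uD800efgh";

    // True if a fresh encoder for cs reports that it cannot encode the sample.
    static boolean rejectsLoneSurrogate(Charset cs) {
        return !cs.newEncoder().canEncode(LONE_SURROGATE);
    }

    // Names of all installed, encoding-capable charsets that accept the sample.
    static List<String> offenders() {
        List<String> names = new ArrayList<>();
        for (Charset cs : Charset.availableCharsets().values()) {
            if (!cs.canEncode()) {
                continue; // decode-only charsets have no encoder to test
            }
            if (!rejectsLoneSurrogate(cs)) {
                names.add(cs.name());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Charsets accepting a lone surrogate: " + offenders());
    }
}
```

Whether the list comes back empty on a given JDK build is exactly what the
scan is meant to find out.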

I would use CodingErrorAction.REPORT instead of REPLACE and check that
the expected exception is thrown and carries reasonable detail.
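
Roughly, and only as a sketch: the helper below (hypothetical name
encodeStrictly) configures an encoder with REPORT for both error kinds and
hands back whatever CharacterCodingException the encode call throws, so a
test can inspect its type and input length.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

class ReportMalformedSurrogate {
    // Encodes s with REPORT semantics; returns the exception thrown,
    // or null if encoding succeeded.
    static CharacterCodingException encodeStrictly(Charset cs, String s) {
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            enc.encode(CharBuffer.wrap(s));
            return null;
        } catch (CharacterCodingException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        // Two high surrogates in a row: the malformed sample from the proposal.
        CharacterCodingException e =
                encodeStrictly(StandardCharsets.UTF_8, "abc\uD800\uDB00efgh");
        if (e instanceof MalformedInputException) {
            MalformedInputException mie = (MalformedInputException) e;
            System.out.println("malformed input, length " + mie.getInputLength());
        } else {
            System.out.println("unexpected result: " + e);
        }
    }
}
```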

For properly paired surrogates, check that the charset can encode the
resulting codepoint using canEncode, and if so, check that encoding
succeeds.
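
As a sketch of that consistency check (the class and method names are
hypothetical): build the surrogate pair from a code point, ask canEncode,
and only if it says yes require that a strict encode completes without an
exception.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class PairedSurrogateCheck {
    // Returns false only when canEncode says yes but strict encoding fails.
    static boolean canEncodeImpliesEncodes(Charset cs, int codePoint) {
        // For a supplementary code point this yields a high/low surrogate pair.
        String s = new String(Character.toChars(codePoint));
        if (!cs.newEncoder().canEncode(s)) {
            return true; // charset makes no claim, so there is nothing to check
        }
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            enc.encode(CharBuffer.wrap(s));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The pair \uD800\uDC00 from the proposal, expressed as a code point.
        int cp = Character.toCodePoint('\uD800', '\uDC00');
        System.out.println(canEncodeImpliesEncodes(StandardCharsets.UTF_8, cp));
    }
}
```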

On Thu, Nov 7, 2013 at 7:22 AM, Eric Wang <yiming.wang at oracle.com> wrote:

>    Hi Everyone
>
> I am working on bug https://bugs.openjdk.java.net/browse/JDK-8022879.
> The test sun/nio/cs/MalformedSurrogates.java
> <http://hg.openjdk.java.net/jdk8/tl/jdk/file/44fa6bf42846/test/sun/nio/cs/MalformedSurrogates.java>
> doesn't run if the system default encoding is UTF-8. Unfortunately,
> UTF-8 is the default charset on most test machines, so the test gets
> few chances to be executed.
> Another defect is that the test would fail if the default charset is
> UTF-16 or UTF-32, as the test doesn't take those two charsets into
> consideration.
>
> The idea of the fix is that, no matter what the system charset is, the
> test should always be executed. Thanks to Martin's suggestion, instead of
> checking byte size, we can use CharsetEncoder.canEncode() and
> CharsetEncoder.onMalformedInput(CodingErrorAction.REPLACE)
> to check for and replace malformed chars.
>
> So the test can be re-designed as below:
>
> 1. Use CharsetEncoder.canEncode() to check whether the string includes
> malformed characters.
> 2. If a string includes malformed characters, e.g. "abc\uD800\uDB00efgh",
> set CharsetEncoder.onMalformedInput(CodingErrorAction.REPLACE) to
> replace the malformed characters with the replacement "?" when calling
> the CharsetEncoder.encode() method.
> 3. Verify by decoding the encoded ByteBuffer to a CharBuffer: check
> that it includes the replacement "?" and compare it with the original
> string; if they are not equal, the test passes.
> 4. If a string doesn't include malformed characters, e.g.
> "abc\uD800\uDC00efgh", CharsetEncoder.encode() converts it to a
> ByteBuffer that doesn't include the replacement "?".
> 5. Verify by decoding the encoded ByteBuffer to a CharBuffer: confirm that
> it doesn't include the replacement "?" and compare it with the original
> string; if they are equal, the test passes.
>
> Please let me know if you have any comments or suggestions.
>
> Thanks,
> Eric
>