[9] RFR: 8145974: XMLStreamWriter produces invalid XML for surrogate pairs on OutputStreamWriter

huizhe wang huizhe.wang at oracle.com
Thu May 12 04:51:43 UTC 2016


Hi Aleksej,

The change looks good overall.  It may be better to replace the name 
"writeCodePoint" with "writeCharRef" or "writeEscaped". Doing the 
"isSurrogatePair" check in place of the call for writeSurrogatePair may 
be more descriptive and readable as well, that is:

+                if (index != end - 1 && Character.isSurrogatePair(ch, content[index+1])) {
+                    writeCharRef(Character.toCodePoint(ch, content[index+1]));
+                    index++;
+                } else {
+                    writeCharRef(ch);
+                }

If you do that, you wouldn't need the method "writeSurrogatePair".
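
To make the suggestion concrete, here is a minimal standalone sketch of 
that loop. It is not the actual XMLStreamWriterImpl code: the class name 
CharRefSketch, the helper escapeAll, and the StringBuilder-based 
writeCharRef are hypothetical stand-ins, and the sketch assumes that 
every char is escaped as a hexadecimal character reference (the real 
writer would only escape characters the encoder cannot represent).

```java
public class CharRefSketch {

    // Hypothetical stand-in for the suggested writeCharRef:
    // appends one hexadecimal character reference for the code point.
    static void writeCharRef(StringBuilder out, int codePoint) {
        out.append("&#x").append(Integer.toHexString(codePoint)).append(';');
    }

    // Escapes every char in content[start, end), combining a valid
    // high/low surrogate pair into a single character reference.
    static String escapeAll(char[] content, int start, int end) {
        StringBuilder out = new StringBuilder();
        for (int index = start; index < end; index++) {
            char ch = content[index];
            if (index != end - 1 && Character.isSurrogatePair(ch, content[index + 1])) {
                writeCharRef(out, Character.toCodePoint(ch, content[index + 1]));
                index++; // the low surrogate has been consumed as well
            } else {
                writeCharRef(out, ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        char[] content = "\ud83d\ude0a".toCharArray();
        System.out.println(escapeAll(content, 0, content.length)); // &#x1f60a;
    }
}
```

Note that an unpaired surrogate falls through to the else branch and is 
still written as its own reference, so the sketch does not address that 
case.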

For the test, it may be good to call writer.close() at the end of the 
test. Also, would the content read be the same as (vs. contain) the 
expectedContent?

Thanks,
Joe

On 5/11/2016 4:30 PM, Aleks Efimov wrote:
> Hello,
>
> Please, help to review the fix for XMLStreamWriter bug [1]:
> XMLStreamWriter incorrectly writes surrogate pairs as a pair of 
> invalid character references.
> For example: "\ud83d\ude0a" is transformed into "&#xd83d;&#xde0a;". It 
> should be one character reference "&#x1f60a;" instead.
> The proposed patch fixes the XMLStreamWriterImpl to correctly process 
> surrogate pairs:
> http://cr.openjdk.java.net/~aefimov/8145974/9/00
>
> The build with fix applied was tested with JTREG and JCK xml tests - 
> no related issues detected.
>
> Aleksej
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8145974
>
