<Swing Dev> <Swind Dev> [7u6] Review request for 6836089: Swing HTML parser can't properly decode codepoints outside the Unicode Plane 0 into a surrogate pair

Vladislav Karnaukhov Vladislav.Karnaukhov at oracle.com
Thu Jun 28 14:58:44 UTC 2012


Hello Phil, Pavel,

thank you for your comments. I've reworked fix and testcase, please find 
new webrev here: http://cr.openjdk.java.net/~vkarnauk/6836089/webrev.03/

Regards,
- Vlad

On 6/27/2012 10:13 PM, Phil Race wrote:
> Well its not only unnecessary but is likely wrong .. I don't think
> you looked at what mapNumericReference() does.
>
> -phil.
>
> On 6/27/2012 11:03 AM, Vladislav Karnaukhov wrote:
>> Hello Phil,
>>
>> I used Character.toChars() in both branches because I wanted to 
>> delegate code point conversion to char or surrogate pair entirely to 
>> Character class...
>>
>> Regards,
>> - Vlad
>>
>> On 6/26/2012 9:49 PM, Phil Race wrote:
>>> I don't understand why you call Character.toChars() if you've just 
>>> determined
>>> you don't need to ?
>>>
>>> ie what was wrong with
>>>
>>> data = ( n >>> 16 == 0) ? {mapNumericReference((char) n)} : 
>>> Character.toChars(n);
>>>
>>> ?
>>>
>>> In the case of an invalid supplementary pair, maybe it would be 
>>> safer to  return { ' ' } ?
>>>
>>>
>>> One thing I see in the parsing code that is not new or changed here, 
>>> that
>>> may bear examination, is that there's a loop that keeps on reading 
>>> so long
>>> so long there are new digits. I am not sure its wise to keep going once
>>> you overflow.
>>>
>>> -phil.
>>>
>>> On 6/26/2012 12:37 AM, Vladislav Karnaukhov wrote:
>>>> Hello Pavel,
>>>>
>>>> I can provide you with the link to 6u19, but this is direct 
>>>> forward-port and no code changes were made.
>>>>
>>>> I'll make changes as you've pointed out in 1) and 2)
>>>>
>>>> About 3) - is it a requirement to use "? :" operator? I personally 
>>>> prefer single-line if-else, but I don't want to argue over code 
>>>> style, and surely I'll follow code design practices.
>>>>
>>>> Regards,
>>>> - Vlad
>>>>
>>>> On 6/25/2012 6:43 PM, Pavel Porvatov wrote:
>>>>> Hi Vladislav,
>>>>>
>>>>> Do you have a link to the fix for 6u19?
>>>>>
>>>>> I didn't investigate the fix deeply, but
>>>>>
>>>>> 1.
>>>>>  private final int MAX_BMP_BOUND = 65535;
>>>>> should be static (otherwise variable name should be in lower case)
>>>>>
>>>>> 2. Add a space in single line comments
>>>>>
>>>>> 3.
>>>>> +                    char data[];
>>>>> +                    if (n <= MAX_BMP_BOUND) {
>>>>> +                        data = 
>>>>> Character.toChars(mapNumericReference((char) n));
>>>>> +                    } else {
>>>>> +                        data = Character.toChars(n);
>>>>> +                    }
>>>>> +
>>>>>                  return data;
>>>>>
>>>>> can be written in one line via "? :" operator and looks more 
>>>>> readable for me
>>>>>
>>>>> Thanks, Pavel
>>>>>> Hello,
>>>>>>
>>>>>> please review the fix for 6836089: Swing HTML parser can't 
>>>>>> properly decode codepoints outside the Unicode Plane 0 into a 
>>>>>> surrogate pair. This is a forward port from JDK6 (fixed escalated 
>>>>>> issue, fix integrated) to JDK7.
>>>>>>
>>>>>> The issue is a defect in Swing HTML Parser: if the codepoint is 
>>>>>> outside BMP (Unicode Plain 0), Parser incorrectly decodes 
>>>>>> codepoint into surrogate pair. The fix is to use 
>>>>>> Character.toChars() method if codepoint value is greater than 
>>>>>> upper bound of BMP.
>>>>>>
>>>>>> Webrev: http://cr.openjdk.java.net/~vkarnauk/6836089/webrev.00/
>>>>>> Bug description: 
>>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6836089
>>>>>>
>>>>>> Regards,
>>>>>> - Vlad
>>>>>
>>>>
>>>>
>>>
>>
>>
>





More information about the swing-dev mailing list