<Swing Dev> <Swind Dev> [7u6] Review request for 6836089: Swing HTML parser can't properly decode codepoints outside the Unicode Plane 0 into a surrogate pair

Vladislav Karnaukhov Vladislav.Karnaukhov at oracle.com
Wed Jun 27 18:03:59 UTC 2012


Hello Phil,

I used Character.toChars() in both branches because I wanted to delegate 
code point conversion to char or surrogate pair entirely to Character 
class...

Regards,
- Vlad

On 6/26/2012 9:49 PM, Phil Race wrote:
> I don't understand why you call Character.toChars() if you've just 
> determined
> you don't need to ?
>
> ie what was wrong with
>
> data = ( n >>> 16 == 0) ? {mapNumericReference((char) n)} : 
> Character.toChars(n);
>
> ?
>
> In the case of an invalid supplementary pair, maybe it would be safer 
> to  return { ' ' } ?
>
>
> One thing I see in the parsing code that is not new or changed here, that
> may bear examination, is that there's a loop that keeps on reading so 
> long
> so long there are new digits. I am not sure its wise to keep going once
> you overflow.
>
> -phil.
>
> On 6/26/2012 12:37 AM, Vladislav Karnaukhov wrote:
>> Hello Pavel,
>>
>> I can provide you with the link to 6u19, but this is direct 
>> forward-port and no code changes were made.
>>
>> I'll make changes as you've pointed out in 1) and 2)
>>
>> About 3) - is it a requirement to use "? :" operator? I personally 
>> prefer single-line if-else, but I don't want to argue over code 
>> style, and surely I'll follow code design practices.
>>
>> Regards,
>> - Vlad
>>
>> On 6/25/2012 6:43 PM, Pavel Porvatov wrote:
>>> Hi Vladislav,
>>>
>>> Do you have a link to the fix for 6u19?
>>>
>>> I didn't investigate the fix deeply, but
>>>
>>> 1.
>>>  private final int MAX_BMP_BOUND = 65535;
>>> should be static (otherwise variable name should be in lower case)
>>>
>>> 2. Add a space in single line comments
>>>
>>> 3.
>>> +                    char data[];
>>> +                    if (n <= MAX_BMP_BOUND) {
>>> +                        data = 
>>> Character.toChars(mapNumericReference((char) n));
>>> +                    } else {
>>> +                        data = Character.toChars(n);
>>> +                    }
>>> +
>>>                  return data;
>>>
>>> can be written in one line via "? :" operator and looks more 
>>> readable for me
>>>
>>> Thanks, Pavel
>>>> Hello,
>>>>
>>>> please review the fix for 6836089: Swing HTML parser can't properly 
>>>> decode codepoints outside the Unicode Plane 0 into a surrogate 
>>>> pair. This is a forward port from JDK6 (fixed escalated issue, fix 
>>>> integrated) to JDK7.
>>>>
>>>> The issue is a defect in Swing HTML Parser: if the codepoint is 
>>>> outside BMP (Unicode Plain 0), Parser incorrectly decodes codepoint 
>>>> into surrogate pair. The fix is to use Character.toChars() method 
>>>> if codepoint value is greater than upper bound of BMP.
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~vkarnauk/6836089/webrev.00/
>>>> Bug description: 
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6836089
>>>>
>>>> Regards,
>>>> - Vlad
>>>
>>
>>
>





More information about the swing-dev mailing list