Code Review Request, JDK-8146600 AVA Normalizer.Form issue

Xuelei Fan xuelei.fan at oracle.com
Mon Sep 19 15:16:35 UTC 2016


On 9/19/2016 11:03 PM, Wang Weijun wrote:
> After some thinking, my current opinion is:
>
> 1. Maybe NFC is better than NFKD, but I am not a Unicode expert.
>
The fix changes the normalization form from NFKD to NFD, so I did not 
get the point.  Do you mean NFC is better than NFD?
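
For reference, a minimal sketch of how the two canonical forms differ, 
using java.text.Normalizer.  The class name and sample strings are 
made up for illustration and are not taken from the patch.  NFD 
decomposes a precomposed character, NFC recomposes it, and neither 
applies compatibility mappings:

```java
import java.text.Normalizer;

// Hypothetical demo class, not part of the JDK or of the fix under review.
final class CanonicalFormsDemo {

    // U+00E9 LATIN SMALL LETTER E WITH ACUTE (precomposed).
    static final String COMPOSED = "\u00E9";
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT (decomposed).
    static final String DECOMPOSED = "e\u0301";

    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        // NFD splits the precomposed character into base + combining mark.
        System.out.println(nfd(COMPOSED).equals(DECOMPOSED));
        // NFC puts the pair back together.
        System.out.println(nfc(DECOMPOSED).equals(COMPOSED));
        // Both are canonical forms: a fullwidth comma (U+FF0C) is left alone.
        System.out.println(nfc("\uFF0C").equals("\uFF0C"));
        System.out.println(nfd("\uFF0C").equals("\uFF0C"));
    }
}
```

All four checks should print true, so NFC and NFD agree on which 
strings are equivalent and differ only in the representation chosen.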

> 2. I think the real bug is the order of escaping and normalization. The normalization (if it is a must) should be performed earlier, right after valStr is created, and only on valStr. Otherwise the NFKD normalization would generate new chars that need to be escaped. Again, I am not a Unicode expert and I don't know if NFC would do the same.
>
I don't get the point.  The update moves from NFKD to NFD, so there is 
no NFKD normalization any more.

> If 2) is fixed, whatever is correct in 1) does not matter much.
>
If we continued to use NFKD, normalizing before escaping would result 
in an unexpected string, as we discussed for the hello-world example.  
That is something I want to avoid, so the fix uses NFD instead.  I 
think that once we move to NFD, it does not matter whether escaping or 
normalization comes first, if I understand UTF-8 correctly.
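
A rough sketch of the concern, under stated assumptions.  The archived 
text of the hello-world example shows two identical-looking strings, 
so the sketch assumes one of them contained a fullwidth comma 
(U+FF0C); that is a guess, not something confirmed in this thread.  
The escape() helper is a deliberately simplified stand-in that only 
escapes ASCII commas; it is not the AVA implementation.

```java
import java.text.Normalizer;

// Hypothetical demo class, not part of the JDK or of the fix under review.
final class NormalizerFormDemo {

    // Assumed example: "Hello" + U+FF0C FULLWIDTH COMMA + " world!".
    static final String FULLWIDTH = "Hello\uFF0C world!";
    static final String ASCII = "Hello, world!";

    static String normalize(String s, Normalizer.Form form) {
        return Normalizer.normalize(s, form);
    }

    // Simplified stand-in for DN escaping: backslash-escape ASCII commas only.
    static String escape(String s) {
        return s.replace(",", "\\,");
    }

    public static void main(String[] args) {
        // NFKD applies compatibility mappings, so the two strings collapse.
        System.out.println(normalize(FULLWIDTH, Normalizer.Form.NFKD).equals(ASCII));

        // NFD applies canonical mappings only, so the strings stay distinct.
        System.out.println(normalize(FULLWIDTH, Normalizer.Form.NFD).equals(FULLWIDTH));

        // Escape first, then NFKD: a new, unescaped ASCII comma appears.
        String escapedThenNfkd = normalize(escape(FULLWIDTH), Normalizer.Form.NFKD);
        System.out.println(escapedThenNfkd.equals(ASCII));

        // With NFD the order does not matter for this input.
        String a = normalize(escape(FULLWIDTH), Normalizer.Form.NFD);
        String b = escape(normalize(FULLWIDTH, Normalizer.Form.NFD));
        System.out.println(a.equals(b));
    }
}
```

All four lines should print true.  The third one is the case 
described in point 2: after escaping, NFKD produces a comma that was 
never escaped.  With NFD that does not happen for this input.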

Thanks,
Xuelei

> Thanks
> Max
>
>> On Sep 19, 2016, at 10:32 AM, Xuelei Fan <xuelei.fan at oracle.com> wrote:
>>
>>> 4. Is it possible to perform normalization before escaping special characters?
>>>
>> Yes.  I thought about this case.  The current fix comes from the fact that UTF-8 "Hello, world!" and "Hello, world!" should be different. Parsing them as the same thing may result in unexpected, serious issues.
>


