Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Xuelei Fan
Xuelei.Fan at Oracle.Com
Fri Aug 9 07:09:33 UTC 2013
On Aug 9, 2013, at 14:08, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:
> Xuelei,
>
> 119 p = q + 1;
> 120 if (p < input.length() || q == (input.length() - 1)) {
>
> Could be simplified to:
>
> q <= input.length()-1
>
It's cool!
Xuelei
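
As a quick sanity check of the simplification above, here is a small
sketch that compares the two forms of the condition by brute force.
The class and method names are made up for illustration and are not
part of the webrev. It assumes q is either the index of the next dot
or input.length(), so q ranges from 0 to input.length() inclusive.

```java
// Hypothetical helper, not part of the webrev: compares the two conditions.
class DotConditionCheck {
    // Condition as quoted from lines 119-120, with p = q + 1.
    static boolean original(int q, int len) {
        int p = q + 1;
        return p < len || q == (len - 1);
    }

    // Simplified form suggested in the review.
    static boolean simplified(int q, int len) {
        return q <= len - 1;
    }

    // Returns true if both forms agree for every len in [0, maxLen]
    // and every q in [0, len].
    static boolean agreeUpTo(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int q = 0; q <= len; q++) {
                if (original(q, len) != simplified(q, len)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeUpTo(255));
    }
}
```

Since p = q + 1, "p < len" is the same as "q < len - 1", and adding the
"q == len - 1" case gives "q <= len - 1".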
> -Dmitry
>
> On 2013-08-09 04:41, Xuelei Fan wrote:
>> Ping.
>>
>> Thanks,
>> Xuelei
>>
>> On 8/7/2013 11:17 PM, Xuelei Fan wrote:
>>> Please review the new update:
>>>
>>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>>>
>>> With this update, "com." is valid (return "com."); "." and
>>> "example..com" are invalid. And IAE will be thrown for invalid IDN.
>>>
>>> Thanks,
>>> Xuelei
>>>
>>> On 8/7/2013 10:18 PM, Michael McMahon wrote:
>>>> On 07/08/13 15:13, Xuelei Fan wrote:
>>>>> On 8/7/2013 10:05 PM, Michael McMahon wrote:
>>>>>> Resolvers seem to accept queries using trailing dots.
>>>>>>
>>>>>> eg nslookup www.oracle.com.
>>>>>>
>>>>>> or InetAddress.getByName("www.oracle.com.");
>>>>>>
>>>>>> The part of RFC3490 quoted below seems to me to be saying
>>>>>> that the empty label implied by the trailing dot is not regarded
>>>>>> as a label so that you don't end up calling toAscii() or toUnicode()
>>>>>> with an empty string. I don't think it's saying the trailing dot can't
>>>>>> be there.
>>>>> It makes sense.
>>>>>
>>>>> What's your preference to return for IDN.toASCII("www.oracle.com."),
>>>>> "www.oracle.com." or "www.oracle.com"? The current returned value is
>>>>> "www.oracle.com". I would like to preserve the behavior in this update.
>>>>
>>>> My opinion is to keep it as at present ie. "www.oracle.com."
>>>>
>>>> Michael
>>>>
>>>>> I think we will be on the same page soon.
>>>>>
>>>>> Thanks,
>>>>> Xuelei
>>>>>
>>>>>> Michael
>>>>>>
>>>>>> On 07/08/13 13:44, Xuelei Fan wrote:
>>>>>>> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>>>>>>>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>>>>>>>> and the single dot represents the root zone. So you have to be
>>>>>>>> careful making this sort of change to check the DNS RFCs first.
>>>>>>> That's the first question we need to answer: whether IDN allows trailing
>>>>>>> dots ("com."), a zero-length root label ("."), and zero-length labels ("",
>>>>>>> for example "example..com").
>>>>>>>
>>>>>>> Per the specification of IDN.toASCII():
>>>>>>> =======================================
>>>>>>> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
>>>>>>> ToASCII operation fails, an IllegalArgumentException will be thrown. In
>>>>>>> this case, the input string should not be used in an internationalized
>>>>>>> domain name.
>>>>>>>
>>>>>>> A label is an individual part of a domain name. The original ToASCII
>>>>>>> operation, as defined in RFC 3490, only operates on a single label. This
>>>>>>> method can handle both label and entire domain name, by assuming that
>>>>>>> labels in a domain name are always separated by dots. ...
>>>>>>>
>>>>>>> Throws IllegalArgumentException - if the input string doesn't conform to
>>>>>>> RFC 3490 specification"
>>>>>>>
>>>>>>> Per the specification of RFC 3490:
>>>>>>> ==================================
>>>>>>> [section 2]
>>>>>>> "A label is an individual part of a domain name. Labels are usually
>>>>>>> shown separated by dots; for example, the domain name
>>>>>>> "www.example.com" is composed of three labels: "www", "example", and
>>>>>>> "com". (The zero-length root label described in [STD13], which can
>>>>>>> be explicit as in "www.example.com." or implicit as in
>>>>>>> "www.example.com", is not considered a label in this
>>>>>>> specification.)"
>>>>>>>
>>>>>>> "An "internationalized label" is a label to which the ToASCII
>>>>>>> operation (see section 4) can be applied without failing (with the
>>>>>>> UseSTD3ASCIIRules flag unset). ...
>>>>>>> Although most Unicode characters can appear in
>>>>>>> internationalized labels, ToASCII will fail for some input strings,
>>>>>>> and such strings are not valid internationalized labels."
>>>>>>>
>>>>>>> "An "internationalized domain name" (IDN) is a domain name in which
>>>>>>> every label is an internationalized label."
>>>>>>>
>>>>>>> [Section 4.1]
>>>>>>> "ToASCII consists of the following steps:
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>> 8. Verify that the number of code points is in the range 1 to 63
>>>>>>> inclusive."
>>>>>>>
>>>>>>>
>>>>>>> Here are the questions:
>>>>>>> 1. Is "example..com" a valid IDN?
>>>>>>> As dots are used as label separators, there are three labels,
>>>>>>> "example", "", "com". Per RFC 3490, "" is not a valid label. Hence,
>>>>>>> "example..com" is not a valid IDN.
>>>>>>>
>>>>>>> We need to address the issue in IDN.
>>>>>>>
>>>>>>> 2. Is "xyz." a valid IDN?
>>>>>>> It's a gray area, I think. We can treat the trailing "." as the root
>>>>>>> label, or as a label separator.
>>>>>>> If the trailing "." is treated as a label separator, "xyz." is invalid
>>>>>>> per RFC 3490.
>>>>>>> If the trailing "." is treated as the root label, what's the expected
>>>>>>> return value of IDN.toASCII("xyz.")? I think the return value can be
>>>>>>> either "xyz." or "xyz". The current implementation returns "xyz".
>>>>>>>
>>>>>>> We may not need to update the implementation if the trailing "." is
>>>>>>> treated as the root label.
>>>>>>>
>>>>>>> 3. Is "." a valid IDN?
>>>>>>> It's a gray area again, I think.
>>>>>>> As above, if the trailing "." is treated as the root label, I think the
>>>>>>> return value can be either "." or "". The current implementation throws
>>>>>>> a StringIndexOutOfBoundsException.
>>>>>>>
>>>>>>> However, what does an empty domain name ("") really mean? I would prefer
>>>>>>> to return "." for "." instead.
>>>>>>>
>>>>>>> We need to address the issue in IDN.
>>>>>>>
>>>>>>>
>>>>>>> Here is the proposed solution. IDN.toASCII() returns:
>>>>>>> 1. "." for ".";
>>>>>>> 2. "xyz" for "xyz.";
>>>>>>> 3. IAE for "example..com".
>>>>>>>
>>>>>>> Does it make sense?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Xuelei
>>>>>>>
>>>>>>>
>>>>>>> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>>>>>>>> I don't really understand the reason for the restriction in SNIHostName.
>>>>>>>> But I guess that is where it should be enforced, if it is required.
>>>>>>>>
>>>>>>>> Michael.
>>>>>>>>
>>>>>>>> On 06/08/13 17:43, Dmitry Samersoff wrote:
>>>>>>>>> Xuelei,
>>>>>>>>>
>>>>>>>>> . (dot) is a perfectly valid domain name and it means the root domain, so
>>>>>>>>> com. is a valid domain name as well.
>>>>>>>>>
>>>>>>>>> It seems to me that, in the context of the methods your change touches,
>>>>>>>>> we should ignore trailing dots rather than throw an exception.
>>>>>>>>>
>>>>>>>>> -Dmitry
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Please review the bug fix to tighten the illegal-input checking in
>>>>>>>>>> IDN.
>>>>>>>>>>
>>>>>>>>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>>>>>>>>
>>>>>>>>>> Here are two test cases, which are expected to get IAE.
>>>>>>>>>>
>>>>>>>>>> Case 1:
>>>>>>>>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
>>>>>>>>>> at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>>>>>>>> at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>>>>>>>> at java.net.IDN.toASCII(IDN.java:118)
>>>>>>>>>>
>>>>>>>>>> Case 2:
>>>>>>>>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Xuelei
>
>
> --
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the source code.
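
For reference, the behavior described for webrev.01 earlier in this
thread ("com." is valid and keeps its trailing dot; "." and
"example..com" are rejected with IAE) can be sketched as below. This is
only an illustration under stated assumptions: the class and method
names are hypothetical, only the ASCII '.' is treated as a separator,
and the sketch checks for empty labels without performing any of the
real ToASCII steps. It is not the code in the webrev.

```java
// Hypothetical sketch of the label-splitting rules discussed in this thread.
// It does NOT call java.net.IDN and does NOT implement RFC 3490 ToASCII.
class TrailingDotSketch {
    // Returns the input unchanged if every label is non-empty, keeping one
    // trailing dot if present; throws IllegalArgumentException otherwise.
    static String checkLabels(String input) {
        StringBuilder out = new StringBuilder();
        int p = 0;
        while (p < input.length()) {
            int q = input.indexOf('.', p);
            if (q < 0) {
                q = input.length();   // no more dots: last label
            }
            String label = input.substring(p, q);
            if (label.isEmpty()) {
                // Covers both "." and "example..com".
                throw new IllegalArgumentException(
                        "Empty label in: " + input);
            }
            out.append(label);
            if (q <= input.length() - 1) {
                // More labels follow, or this is the trailing dot.
                out.append('.');
            }
            p = q + 1;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"com.", "www.oracle.com", ".", "example..com"}) {
            try {
                System.out.println(s + " -> " + checkLabels(s));
            } catch (IllegalArgumentException e) {
                System.out.println(s + " -> IAE");
            }
        }
    }
}
```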