Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

Xuelei Fan xuelei.fan at oracle.com
Fri Aug 9 00:41:23 UTC 2013


Ping.

Thanks,
Xuelei

On 8/7/2013 11:17 PM, Xuelei Fan wrote:
> Please review the new update:
> 
> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
> 
> With this update, "com." is valid (return "com."); "." and
> "example..com" are invalid.  And IAE will be thrown for invalid IDN.
> 
> Thanks,
> Xuelei
> 
> On 8/7/2013 10:18 PM, Michael McMahon wrote:
>> On 07/08/13 15:13, Xuelei Fan wrote:
>>> On 8/7/2013 10:05 PM, Michael McMahon wrote:
>>>> Resolvers seem to accept queries using trailing dots.
>>>>
>>>> eg nslookup www.oracle.com.
>>>>
>>>> or InetAddress.getByName("www.oracle.com.");
>>>>
>>>> The part of RFC3490 quoted below seems to me to be saying
>>>> that the empty label implied by the trailing dot is not regarded
>>>> as a label so that you don't end up calling toAscii() or toUnicode()
>>>> with an empty string. I don't think it's saying the trailing dot can't
>>>> be there.
>>>>
>>> It makes sense.
>>>
>>> What's your preference to return for IDN.toASCII("www.oracle.com."),
>>> "www.oracle.com." or "www.oracle.com"? The current returned value is
>>> "www.oracle.com".  I would like to preserve the behavior in this update.
>>
>> My opinion is to keep it as at present ie. "www.oracle.com."
>>
>> Michael
>>
>>> I think we will be on the same page soon.
>>>
>>> Thanks,
>>> Xuelei
>>>
>>>> Michael
>>>>
>>>> On 07/08/13 13:44, Xuelei Fan wrote:
>>>>> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>>>>>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>>>>>> and the single dot represents the root zone. So you have to be
>>>>>> careful making this sort of change to check the DNS RFCs first.
>>>>> That's the first question we need to answer: whether IDN allows trailing
>>>>> dots ("com."), the zero-length root label ("."), and zero-length labels
>>>>> ("", for example "example..com").
>>>>>
>>>>> Per the specification of IDN.toASCII():
>>>>> =======================================
>>>>> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
>>>>> ToASCII operation fails, an IllegalArgumentException will be thrown. In
>>>>> this case, the input string should not be used in an internationalized
>>>>> domain name.
>>>>>
>>>>> A label is an individual part of a domain name. The original ToASCII
>>>>> operation, as defined in RFC 3490, only operates on a single label. This
>>>>> method can handle both label and entire domain name, by assuming that
>>>>> labels in a domain name are always separated by dots. ...
>>>>>
>>>>> Throws IllegalArgumentException - if the input string doesn't conform to
>>>>> RFC 3490 specification"
>>>>>
>>>>> Per the specification of RFC 3490:
>>>>> ==================================
>>>>> [section 2]
>>>>> "A label is an individual part of a domain name.  Labels are usually
>>>>>    shown separated by dots; for example, the domain name
>>>>>    "www.example.com" is composed of three labels: "www", "example", and
>>>>>    "com".  (The zero-length root label described in [STD13], which can
>>>>>    be explicit as in "www.example.com." or implicit as in
>>>>>    "www.example.com", is not considered a label in this specification.)"
>>>>>
>>>>> "An "internationalized label" is a label to which the ToASCII
>>>>>    operation (see section 4) can be applied without failing (with the
>>>>>    UseSTD3ASCIIRules flag unset).  ...
>>>>>    Although most Unicode characters can appear in
>>>>>    internationalized labels, ToASCII will fail for some input strings,
>>>>>    and such strings are not valid internationalized labels."
>>>>>
>>>>> "An "internationalized domain name" (IDN) is a domain name in which
>>>>>    every label is an internationalized label."
>>>>>
>>>>> [Section 4.1]
>>>>> "ToASCII consists of the following steps:
>>>>>
>>>>>    ...
>>>>>
>>>>>    8. Verify that the number of code points is in the range 1 to 63
>>>>>         inclusive."
>>>>>
>>>>>
>>>>> Here are the questions:
>>>>> 1. whether "example..com" is a valid IDN?
>>>>>      As the dot is used as the label separator, there are three labels:
>>>>> "example", "", "com".  Per RFC 3490, "" is not a valid label. Hence,
>>>>> "example..com" is not a valid IDN.
>>>>>
>>>>>      We need to address the issue in IDN.
>>>>>
>>>>> 2. whether "xyz." is a valid IDN?
>>>>>      It's a gray area, I think. We can treat the trailing "." as the
>>>>> root label, or as a label separator.
>>>>>      If the trailing "." is treated as a label separator, "xyz." is
>>>>> invalid per RFC 3490.
>>>>>      If the trailing "." is treated as the root label, what's the
>>>>> expected return value of IDN.toASCII("xyz.")?  I think the return value
>>>>> can be either "xyz." or "xyz".  The current implementation returns "xyz".
>>>>>
>>>>>      We may not need to update the implementation if the trailing "." is
>>>>> treated as the root label.
>>>>>
>>>>> 3. whether "." is a valid IDN?
>>>>>      It's a gray area again, I think.
>>>>>      As above, if the trailing "." is treated as the root label, I think
>>>>> the return value can be either "." or "".  The current implementation
>>>>> throws a StringIndexOutOfBoundsException.
>>>>>
>>>>>      However, what does an empty domain name ("") really mean?  I would
>>>>> prefer to return "." for "." instead.
>>>>>
>>>>>      We need to address the issue in IDN.
>>>>>
>>>>>
>>>>> Here is the proposed solution. IDN.toASCII() returns:
>>>>> 1. "." for ".";
>>>>> 2. "xyz" for "xyz.";
>>>>> 3. IAE for "example..com".
>>>>>
>>>>> Does it make sense?
>>>>>
>>>>> Thanks,
>>>>> Xuelei
>>>>>
>>>>>
>>>>> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>>>>>> I don't really understand the reason for the restriction in
>>>>>> SNIHostName. But I guess that is where it should be enforced if it is
>>>>>> required.
>>>>>>
>>>>>> Michael.
>>>>>>
>>>>>> On 06/08/13 17:43, Dmitry Samersoff wrote:
>>>>>>> Xuelei,
>>>>>>>
>>>>>>> . (dot) is a perfectly valid domain name and it means the root
>>>>>>> domain, so com. is a valid domain name as well.
>>>>>>>
>>>>>>> It seems to me that, in the context of the methods you change, we
>>>>>>> should ignore trailing dots rather than throw an exception.
>>>>>>>
>>>>>>> -Dmitry
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Please review the bug fix to tighten the illegal input checking in
>>>>>>>> IDN.
>>>>>>>>
>>>>>>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>>>>>>
>>>>>>>> Here are two test cases, which are expected to get IAE.
>>>>>>>>
>>>>>>>> Case 1:
>>>>>>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>>>>>>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
>>>>>>>>            at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>>>>>>            at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>>>>>>            at java.net.IDN.toASCII(IDN.java:118)
>>>>>>>>
>>>>>>>> Case 2:
>>>>>>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Xuelei
>>>>>>>>
>>
> 
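
For readers following the thread, the rule proposed in the webrev.01 message ("com." is valid; "." and "example..com" are invalid) can be sketched as a plain label check. This is only an illustration under stated assumptions, not the code in the webrev: the class name HostnameLabelCheck and its methods are hypothetical, only the ASCII dot is treated as a separator, and the 1 to 63 length check is applied to the raw label rather than to the encoded result of ToASCII step 8.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of java.net.IDN or of the webrev.
class HostnameLabelCheck {

    // Drops at most one trailing dot (the explicit root label), then splits
    // on '.' while keeping empty labels so that they can be rejected later.
    static List<String> labels(String name) {
        String body = name.endsWith(".")
                ? name.substring(0, name.length() - 1)
                : name;
        List<String> out = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= body.length(); i++) {
            if (i == body.length() || body.charAt(i) == '.') {
                out.add(body.substring(start, i));
                start = i + 1;
            }
        }
        return out;
    }

    // A name is acceptable here when every remaining label has a length
    // between 1 and 63. Assumption: raw length stands in for the encoded
    // length that RFC 3490 actually checks.
    static boolean isAcceptable(String name) {
        for (String label : labels(name)) {
            if (label.isEmpty() || label.length() > 63) {
                return false;
            }
        }
        return true;
    }
}
```

Under this sketch "." is rejected because nothing but an empty label is left once the trailing dot is dropped, which matches the webrev.01 description but not the earlier proposal in the thread to return "." for ".".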

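Since the thread discusses several possible outcomes for the same inputs, the simplest way to see what a particular JDK does is to run the inputs through IDN.toASCII and report either the result or the exception type. The small harness below is a sketch for that purpose; IdnProbe and probe are hypothetical names, and no expected output is listed for the trailing-dot inputs because it depends on the JDK build in use.

```java
import java.net.IDN;

// Hypothetical probe class for experimenting with the inputs from the thread.
class IdnProbe {

    // Returns the converted name, or the simple name of whatever runtime
    // exception was thrown (for example IllegalArgumentException, or the
    // StringIndexOutOfBoundsException reported in the original bug).
    static String probe(String input) {
        try {
            return "ok: \"" + IDN.toASCII(input, IDN.USE_STD3_ASCII_RULES) + "\"";
        } catch (RuntimeException e) {
            return "threw: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        String[] inputs = {".", "com.", "example..com", "www.oracle.com."};
        for (String s : inputs) {
            System.out.println("\"" + s + "\" -> " + probe(s));
        }
    }
}
```

Catching RuntimeException rather than only IllegalArgumentException is deliberate here, so that the pre-fix failure mode shows up in the output instead of aborting the run.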



More information about the security-dev mailing list