Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot
Michael McMahon
michael.x.mcmahon at oracle.com
Wed Aug 7 14:18:14 UTC 2013
On 07/08/13 15:13, Xuelei Fan wrote:
> On 8/7/2013 10:05 PM, Michael McMahon wrote:
>> Resolvers seem to accept queries using trailing dots.
>>
>> eg nslookup www.oracle.com.
>>
>> or InetAddress.getByName("www.oracle.com.");
>>
>> The part of RFC3490 quoted below seems to me to be saying
>> that the empty label implied by the trailing dot is not regarded
>> as a label so that you don't end up calling toAscii() or toUnicode()
>> with an empty string. I don't think it's saying the trailing dot can't
>> be there.
>>
> It makes sense.
>
> What's your preference for the return value of IDN.toASCII("www.oracle.com."),
> "www.oracle.com." or "www.oracle.com"? The current returned value is
> "www.oracle.com". I would like to preserve the behavior in this update.
My opinion is to keep it as at present, i.e. "www.oracle.com."
Michael
> I think we will be on the same page soon.
>
> Thanks,
> Xuelei
>
>> Michael
>>
>> On 07/08/13 13:44, Xuelei Fan wrote:
>>> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>>>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>>>> and the single dot represents the root zone. So you have to be
>>>> careful making this sort of change to check the DNS RFCs first.
>>> That's the first question we need to answer: does IDN allow trailing
>>> dots ("com."), a zero-length root label ("."), and zero-length labels
>>> ("", for example "example..com")?
>>>
>>> Per the specification of IDN.toASCII():
>>> =======================================
>>> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
>>> ToASCII operation fails, an IllegalArgumentException will be thrown. In
>>> this case, the input string should not be used in an internationalized
>>> domain name.
>>>
>>> A label is an individual part of a domain name. The original ToASCII
>>> operation, as defined in RFC 3490, only operates on a single label. This
>>> method can handle both label and entire domain name, by assuming that
>>> labels in a domain name are always separated by dots. ...
>>>
>>> Throws IllegalArgumentException - if the input string doesn't conform to
>>> RFC 3490 specification"
>>>
>>> Per the specification of RFC 3490:
>>> ==================================
>>> [section 2]
>>> "A label is an individual part of a domain name. Labels are usually
>>> shown separated by dots; for example, the domain name
>>> "www.example.com" is composed of three labels: "www", "example", and
>>> "com". (The zero-length root label described in [STD13], which can
>>> be explicit as in "www.example.com." or implicit as in
>>> "www.example.com", is not considered a label in this specification.)"
>>>
>>> "An "internationalized label" is a label to which the ToASCII
>>> operation (see section 4) can be applied without failing (with the
>>> UseSTD3ASCIIRules flag unset). ...
>>> Although most Unicode characters can appear in
>>> internationalized labels, ToASCII will fail for some input strings,
>>> and such strings are not valid internationalized labels."
>>>
>>> "An "internationalized domain name" (IDN) is a domain name in which
>>> every label is an internationalized label."
>>>
>>> [Section 4.1]
>>> "ToASCII consists of the following steps:
>>>
>>> ...
>>>
>>> 8. Verify that the number of code points is in the range 1 to 63
>>> inclusive."
>>>
>>>
>>> Here are the questions:
>>> 1. whether "example..com" is a valid IDN?
>>> As dots are used as label separators, there are three labels,
>>> "example", "", "com". Per RFC 3490, "" is not a valid label. Hence,
>>> "example..com" is not a valid IDN.
>>>
>>> We need to address the issue in IDN.
>>>
>>> 2. whether "xyz." is a valid IDN?
>>> It's a gray area, I think. We can treat the trailing "." as root
>>> label, or a label separator.
>>> If the trailing "." is treated as label separator, "xyz." is invalid
>>> per RFC 3490.
>>> If the trailing "." is treated as root label, what's the expected
>>> return value of IDN.toASCII("xyz.")? I think the return value can be
>>> either "xyz." or "xyz". The current implementation returns "xyz".
>>>
>>> We may not need to update the implementation if the trailing "." is
>>> treated as root label.
>>>
>>> 3. whether "." is a valid IDN?
>>> It's a gray area again, I think.
>>> As above, if the trailing "." is treated as root label, I think the
>>> return value can be either "." or "". The current implementation throws
>>> a StringIndexOutOfBoundsException.
>>>
>>> However, what does an empty domain name ("") really mean? I would prefer to
>>> return "." for "." instead.
>>>
>>> We need to address the issue in IDN.
>>>
>>>
>>> Here is the proposed solution: IDN.toASCII() returns:
>>> 1. "." for ".";
>>> 2. "xyz" for "xyz.";
>>> 3. IAE for "example..com".
>>>
>>> Does it make sense?
>>>
>>> Thanks,
>>> Xuelei
>>>
>>>
>>> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>>>> I don't really understand the reason for the restriction in SNIHostName.
>>>> But I guess that is where it should be enforced if it is required.
>>>>
>>>> Michael.
>>>>
>>>> On 06/08/13 17:43, Dmitry Samersoff wrote:
>>>>> Xuelei,
>>>>>
>>>>> . (dot) is a perfectly valid domain name and it means the root domain, so com.
>>>>> is a valid domain name as well.
>>>>>
>>>>> It seems to me that, in the context of the methods you changed, we should
>>>>> ignore trailing dots rather than throw an exception.
>>>>>
>>>>> -Dmitry
>>>>>
>>>>>
>>>>>
>>>>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Please review the bug fix to tighten the illegal input checking in IDN.
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>>>>
>>>>>> Here are two test cases, which are expected to get IAE.
>>>>>>
>>>>>> Case 1:
>>>>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>>>>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
>>>>>> String index out of range: 0
>>>>>> at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>>>> at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>>>> at java.net.IDN.toASCII(IDN.java:118)
>>>>>>
>>>>>> Case 2:
>>>>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>>>>
>>>>>> Thanks,
>>>>>> Xuelei
>>>>>>
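
For readers following the thread, the three cases under discussion (".", "xyz." and "example..com") can be illustrated with a small sketch. This is not the JDK's implementation and not the patch under review. The class TrailingDotSketch, the method toAsciiLenient and the keepTrailingDot parameter are hypothetical names made up for illustration. The parameter exists only because the thread leaves open whether "xyz." should come back as "xyz." or "xyz". The sketch splits on the ASCII dot only and hands each non-empty label to java.net.IDN.toASCII(String).

```java
import java.net.IDN;

// Hypothetical helper for illustration only; not part of java.net.IDN.
final class TrailingDotSketch {
    private TrailingDotSketch() {}

    /**
     * Converts a domain name label by label, treating one trailing "."
     * as the root label rather than as a separator before an empty label.
     * Assumption: only the ASCII dot is handled as a separator here.
     */
    static String toAsciiLenient(String input, boolean keepTrailingDot) {
        // Case 1 in the thread: the root domain alone maps to itself.
        if (input.equals(".")) {
            return ".";
        }

        // Case 2: strip a single trailing dot before splitting into labels.
        boolean hasRootDot = input.endsWith(".");
        String body = hasRootDot
                ? input.substring(0, input.length() - 1)
                : input;

        // A limit of -1 keeps trailing empty strings, so "xyz.." is rejected.
        String[] labels = body.split("\\.", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < labels.length; i++) {
            // Case 3: an empty label fails the 1..63 length check of RFC 3490.
            if (labels[i].isEmpty()) {
                throw new IllegalArgumentException("Empty label in: " + input);
            }
            if (i > 0) {
                out.append('.');
            }
            out.append(IDN.toASCII(labels[i]));
        }
        if (hasRootDot && keepTrailingDot) {
            out.append('.');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAsciiLenient(".", false));    // prints "."
        System.out.println(toAsciiLenient("xyz.", false)); // prints "xyz"
        System.out.println(toAsciiLenient("xyz.", true));  // prints "xyz."
        try {
            toAsciiLenient("example..com", false);
        } catch (IllegalArgumentException e) {
            System.out.println("IAE: " + e.getMessage());
        }
    }
}
```

Empty labels are checked before IDN.toASCII is called, so the sketch never reaches the StringIndexOutOfBoundsException shown in Case 1 of the original report.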