Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

Xuelei Fan Xuelei.Fan at Oracle.Com
Fri Aug 9 07:09:33 UTC 2013


On Aug 9, 2013, at 14:08, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:

> Xuelei,
> 
> 119             p = q + 1;
> 120             if (p < input.length() || q == (input.length() - 1)) {
> 
> Could be simplified to:
> 
> q <= input.length()-1
> 
It's cool!

Xuelei
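
The simplification quoted above can be checked mechanically. With
p = q + 1, "p < input.length()" is the same as "q < input.length() - 1",
so the disjunction collapses to "q <= input.length() - 1". Below is a
minimal sketch of that check. The class and method names are hypothetical
and are not part of the webrev; only the two conditions come from the
thread.

```java
// Hypothetical check, not part of the webrev: compares the two loop
// conditions for every (length, q) pair in a small range.
final class ConditionEquivalence {
    // Original condition from line 120, with p = q + 1 as set on line 119.
    static boolean original(int q, int length) {
        int p = q + 1;
        return p < length || q == (length - 1);
    }

    // Simplified form suggested in the review.
    static boolean simplified(int q, int length) {
        return q <= length - 1;
    }

    // Returns true if both forms agree for all 0 <= q <= length <= maxLength.
    static boolean agreeUpTo(int maxLength) {
        for (int length = 0; length <= maxLength; length++) {
            for (int q = 0; q <= length; q++) {
                if (original(q, length) != simplified(q, length)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeUpTo(64)); // prints "true"
    }
}
```

The range 0..length for q is an assumption about what the dot search can
return (an index inside the string, or the string length when no dot is
left).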

> -Dmitry
> 
> On 2013-08-09 04:41, Xuelei Fan wrote:
>> Ping.
>> 
>> Thanks,
>> Xuelei
>> 
>> On 8/7/2013 11:17 PM, Xuelei Fan wrote:
>>> Please review the new update:
>>> 
>>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>>> 
>>> With this update, "com." is valid (return "com."); "." and
>>> "example..com" are invalid.  And IAE will be thrown for invalid IDN.
>>> 
>>> Thanks,
>>> Xuelei
>>> 
>>> On 8/7/2013 10:18 PM, Michael McMahon wrote:
>>>> On 07/08/13 15:13, Xuelei Fan wrote:
>>>>> On 8/7/2013 10:05 PM, Michael McMahon wrote:
>>>>>> Resolvers seem to accept queries using trailing dots.
>>>>>> 
>>>>>> eg nslookup www.oracle.com.
>>>>>> 
>>>>>> or InetAddress.getByName("www.oracle.com.");
>>>>>> 
>>>>>> The part of RFC3490 quoted below seems to me to be saying
>>>>>> that the empty label implied by the trailing dot is not regarded
>>>>>> as a label so that you don't end up calling toAscii() or toUnicode()
>>>>>> with an empty string. I don't think it's saying the trailing dot can't
>>>>>> be there.
>>>>> It makes sense.
>>>>> 
>>>>> What's your preference to return for IDN.toASCII("www.oracle.com."),
>>>>> "www.oracle.com." or "www.oracle.com"? The current returned value is
>>>>> "www.oracle.com".  I would like to preserve that behavior in this update.
>>>> 
>>>> My opinion is to keep it as at present ie. "www.oracle.com."
>>>> 
>>>> Michael
>>>> 
>>>>> I think we will be on the same page soon.
>>>>> 
>>>>> Thanks,
>>>>> Xuelei
>>>>> 
>>>>>> Michael
>>>>>> 
>>>>>> On 07/08/13 13:44, Xuelei Fan wrote:
>>>>>>> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>>>>>>>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>>>>>>>> and the single dot represents the root zone. So you have to be
>>>>>>>> careful making this sort of change to check the DNS RFCs first.
>>>>>>> That's the first question we need to answer: whether IDN allows trailing
>>>>>>> dots ("com."), a zero-length root label ("."), and zero-length labels ("",
>>>>>>> for example "example..com").
>>>>>>> 
>>>>>>> Per the specification of IDN.toASCII():
>>>>>>> =======================================
>>>>>>> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
>>>>>>> ToASCII operation fails, an IllegalArgumentException will be thrown. In
>>>>>>> this case, the input string should not be used in an internationalized
>>>>>>> domain name.
>>>>>>> 
>>>>>>> A label is an individual part of a domain name. The original ToASCII
>>>>>>> operation, as defined in RFC 3490, only operates on a single label.
>>>>>>> This method can handle both label and entire domain name, by assuming
>>>>>>> that labels in a domain name are always separated by dots. ...
>>>>>>> 
>>>>>>> Throws IllegalArgumentException - if the input string doesn't
>>>>>>> conform to RFC 3490 specification"
>>>>>>> 
>>>>>>> Per the specification of RFC 3490:
>>>>>>> ==================================
>>>>>>> [section 2]
>>>>>>> "A label is an individual part of a domain name.  Labels are usually
>>>>>>>   shown separated by dots; for example, the domain name
>>>>>>>   "www.example.com" is composed of three labels: "www", "example", and
>>>>>>>   "com".  (The zero-length root label described in [STD13], which can
>>>>>>>   be explicit as in "www.example.com." or implicit as in
>>>>>>>   "www.example.com", is not considered a label in this
>>>>>>>   specification.)"
>>>>>>> 
>>>>>>> "An "internationalized label" is a label to which the ToASCII
>>>>>>>   operation (see section 4) can be applied without failing (with the
>>>>>>>   UseSTD3ASCIIRules flag unset).  ...
>>>>>>>   Although most Unicode characters can appear in
>>>>>>>   internationalized labels, ToASCII will fail for some input strings,
>>>>>>>   and such strings are not valid internationalized labels."
>>>>>>> 
>>>>>>> "An "internationalized domain name" (IDN) is a domain name in which
>>>>>>>   every label is an internationalized label."
>>>>>>> 
>>>>>>> [Section 4.1]
>>>>>>> "ToASCII consists of the following steps:
>>>>>>> 
>>>>>>>   ...
>>>>>>> 
>>>>>>>   8. Verify that the number of code points is in the range 1 to 63
>>>>>>>        inclusive."
>>>>>>> 
>>>>>>> 
>>>>>>> Here are the questions:
>>>>>>> 1. whether "example..com" is a valid IDN?
>>>>>>>     As dots are used as label separators, there are three labels:
>>>>>>> "example", "", "com".  Per RFC 3490, "" is not a valid label. Hence,
>>>>>>> "example..com" is not a valid IDN.
>>>>>>> 
>>>>>>>     We need to address the issue in IDN.
>>>>>>> 
>>>>>>> 2. whether "xyz." is a valid IDN?
>>>>>>>     It's a gray area, I think. We can treat the trailing "." as the root
>>>>>>> label, or as a label separator.
>>>>>>>     If the trailing "." is treated as a label separator, "xyz." is
>>>>>>> invalid per RFC 3490.
>>>>>>>     If the trailing "." is treated as the root label, what's the expected
>>>>>>> return value of IDN.toASCII("xyz.")?  I think the return value can be
>>>>>>> either "xyz." or "xyz".  The current implementation returns "xyz".
>>>>>>> 
>>>>>>>     We may not need to update the implementation if the trailing "." is
>>>>>>> treated as the root label.
>>>>>>> 
>>>>>>> 3. whether "." is a valid IDN?
>>>>>>>     It's a gray area again, I think.
>>>>>>>     As above, if the trailing "." is treated as the root label, I think
>>>>>>> the return value can be either "." or "".  The current implementation
>>>>>>> throws a StringIndexOutOfBoundsException.
>>>>>>> 
>>>>>>>     However, what does an empty domain name ("") really mean?  I would
>>>>>>> prefer to return "." for "." instead.
>>>>>>> 
>>>>>>>     We need to address the issue in IDN.
>>>>>>> 
>>>>>>> 
>>>>>>> Here is the proposed solution. IDN.toASCII() returns:
>>>>>>> 1. "." for ".";
>>>>>>> 2. "xyz" for "xyz.";
>>>>>>> 3. IAE for "example..com".
>>>>>>> 
>>>>>>> Does it make sense?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Xuelei
>>>>>>> 
>>>>>>> 
>>>>>>> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>>>>>>>> I don't really understand the reason for the restriction in
>>>>>>>> SNIHostName, but I guess that is where it should be enforced if it
>>>>>>>> is required.
>>>>>>>> 
>>>>>>>> Michael.
>>>>>>>> 
>>>>>>>> On 06/08/13 17:43, Dmitry Samersoff wrote:
>>>>>>>>> Xuelei,
>>>>>>>>> 
>>>>>>>>> . (dot) is a perfectly valid domain name and it means the root domain,
>>>>>>>>> so com. is a valid domain name as well.
>>>>>>>>> 
>>>>>>>>> It seems to me that, in the context of the methods you are changing,
>>>>>>>>> we should ignore trailing dots rather than throw an exception.
>>>>>>>>> 
>>>>>>>>> -Dmitry
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> Please review the bug fix to tighten the illegal-input checking in
>>>>>>>>>> IDN.
>>>>>>>>>> 
>>>>>>>>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>>>>>>>> 
>>>>>>>>>> Here are two test cases, both of which are expected to throw IAE.
>>>>>>>>>> 
>>>>>>>>>> Case 1:
>>>>>>>>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
>>>>>>>>>>           at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>>>>>>>>           at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>>>>>>>>           at java.net.IDN.toASCII(IDN.java:118)
>>>>>>>>>> 
>>>>>>>>>> Case 2:
>>>>>>>>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Xuelei
> 
> 
> -- 
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the source code.
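
For readers following the thread, the label rules described for webrev.01
("com." is accepted and returned as "com."; "." and "example..com" are
rejected with IAE) can be illustrated with a small stand-alone sketch. This
is not the java.net.IDN code. The class name and the check() helper are
hypothetical. The sketch assumes ASCII input, treats only U+002E as a
separator, and skips the nameprep and Punycode steps. The thread does not
say how the empty string should be handled, so the sketch returns it
unchanged.

```java
// Hypothetical sketch of the label rules discussed in this thread; it is
// not the java.net.IDN implementation and does no nameprep or Punycode.
final class TrailingDotSketch {
    // RFC 3490, section 4.1, step 8: a label has 1 to 63 code points.
    static final int MAX_LABEL_LENGTH = 63;

    // Returns the input unchanged if every label is 1..63 characters long,
    // allowing one trailing dot; throws IllegalArgumentException otherwise.
    static String check(String input) {
        int p = 0;
        while (p < input.length()) {
            int q = input.indexOf('.', p);
            if (q < 0) {
                q = input.length(); // no more dots: last label ends the string
            }
            int labelLength = q - p;
            if (labelLength < 1 || labelLength > MAX_LABEL_LENGTH) {
                throw new IllegalArgumentException(
                    "label length " + labelLength + " at index " + p);
            }
            p = q + 1; // skip the separator, or step past the end
        }
        return input;
    }

    public static void main(String[] args) {
        System.out.println(check("com."));            // accepted, trailing dot kept
        System.out.println(check("www.example.com")); // accepted
        for (String bad : new String[] {".", "example..com"}) {
            try {
                check(bad);
            } catch (IllegalArgumentException e) {
                System.out.println("rejected: " + bad);
            }
        }
    }
}
```

A trailing dot is accepted here only because the loop stops once p reaches
the end of the input, so no empty label is ever examined after it. A lone
"." still fails because the label before the dot is empty.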



More information about the security-dev mailing list