Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

Dmitry Samersoff dmitry.samersoff at oracle.com
Fri Aug 9 06:08:19 UTC 2013


Xuelei,

 119             p = q + 1;
 120             if (p < input.length() || q == (input.length() - 1)) {

Since p = q + 1 at this point, the condition could be simplified to:

q <= input.length() - 1
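
The equivalence of the two conditions can be checked by brute force. This is only a minimal sketch, assuming p is always q + 1 as on line 119; the class and method names are hypothetical and are not part of the webrev.

```java
public class ConditionCheck {
    // Condition as written in the webrev, with p = q + 1.
    static boolean original(int q, int len) {
        int p = q + 1;
        return p < len || q == (len - 1);
    }

    // Proposed simplification.
    static boolean simplified(int q, int len) {
        return q <= len - 1;
    }

    // Returns true if both forms agree for every 0 <= q <= len <= maxLen.
    static boolean equivalentUpTo(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int q = 0; q <= len; q++) {
                if (original(q, len) != simplified(q, len)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equivalentUpTo(64));
    }
}
```

The range of q covers a dot at any index as well as q == len, which is the "no more dots" case.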

-Dmitry

On 2013-08-09 04:41, Xuelei Fan wrote:
> Ping.
> 
> Thanks,
> Xuelei
> 
> On 8/7/2013 11:17 PM, Xuelei Fan wrote:
>> Please review the new update:
>>
>> http://cr.openjdk.java.net./~xuelei/8020842/webrev.01/
>>
>> With this update, "com." is valid (returns "com."); "." and
>> "example..com" are invalid.  An IAE will be thrown for an invalid IDN.
>>
>> Thanks,
>> Xuelei
>>
>> On 8/7/2013 10:18 PM, Michael McMahon wrote:
>>> On 07/08/13 15:13, Xuelei Fan wrote:
>>>> On 8/7/2013 10:05 PM, Michael McMahon wrote:
>>>>> Resolvers seem to accept queries using trailing dots.
>>>>>
>>>>> eg nslookup www.oracle.com.
>>>>>
>>>>> or InetAddress.getByName("www.oracle.com.");
>>>>>
>>>>> The part of RFC3490 quoted below seems to me to be saying
>>>>> that the empty label implied by the trailing dot is not regarded
>>>>> as a label so that you don't end up calling toAscii() or toUnicode()
>>>>> with an empty string. I don't think it's saying the trailing dot can't
>>>>> be there.
>>>>>
>>>> It makes sense.
>>>>
>>>> What would you prefer IDN.toASCII("www.oracle.com.") to return,
>>>> "www.oracle.com." or "www.oracle.com"? The current returned value is
>>>> "www.oracle.com".  I would like to preserve that behavior in this update.
>>>
>>> My opinion is to keep it as at present, i.e. "www.oracle.com."
>>>
>>> Michael
>>>
>>>> I think we will be on the same page soon.
>>>>
>>>> Thanks,
>>>> Xuelei
>>>>
>>>>> Michael
>>>>>
>>>>> On 07/08/13 13:44, Xuelei Fan wrote:
>>>>>> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>>>>>>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>>>>>>> and the single dot represents the root zone. So you have to be
>>>>>>> careful making this sort of change to check the DNS RFCs first.
>>>>>> That's the first question we need to answer: does IDN allow trailing
>>>>>> dots ("com."), a zero-length root label ("."), and zero-length labels
>>>>>> ("", for example in "example..com")?
>>>>>>
>>>>>> Per the specification of IDN.toASCII():
>>>>>> =======================================
>>>>>> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
>>>>>> ToASCII operation fails, an IllegalArgumentException will be thrown. In
>>>>>> this case, the input string should not be used in an internationalized
>>>>>> domain name.
>>>>>>
>>>>>> A label is an individual part of a domain name. The original ToASCII
>>>>>> operation, as defined in RFC 3490, only operates on a single label. This
>>>>>> method can handle both label and entire domain name, by assuming that
>>>>>> labels in a domain name are always separated by dots. ...
>>>>>>
>>>>>> Throws IllegalArgumentException - if the input string doesn't conform to
>>>>>> RFC 3490 specification"
>>>>>>
>>>>>> Per the specification of RFC 3490:
>>>>>> ==================================
>>>>>> [section 2]
>>>>>> "A label is an individual part of a domain name.  Labels are usually
>>>>>>    shown separated by dots; for example, the domain name
>>>>>>    "www.example.com" is composed of three labels: "www", "example", and
>>>>>>    "com".  (The zero-length root label described in [STD13], which can
>>>>>>    be explicit as in "www.example.com." or implicit as in
>>>>>>    "www.example.com", is not considered a label in this specification.)"
>>>>>>
>>>>>> "An "internationalized label" is a label to which the ToASCII
>>>>>>    operation (see section 4) can be applied without failing (with the
>>>>>>    UseSTD3ASCIIRules flag unset).  ...
>>>>>>    Although most Unicode characters can appear in
>>>>>>    internationalized labels, ToASCII will fail for some input strings,
>>>>>>    and such strings are not valid internationalized labels."
>>>>>>
>>>>>> "An "internationalized domain name" (IDN) is a domain name in which
>>>>>>    every label is an internationalized label."
>>>>>>
>>>>>> [Section 4.1]
>>>>>> "ToASCII consists of the following steps:
>>>>>>
>>>>>>    ...
>>>>>>
>>>>>>    8. Verify that the number of code points is in the range 1 to 63
>>>>>>         inclusive."
>>>>>>
>>>>>>
>>>>>> Here are the questions:
>>>>>> 1. Is "example..com" a valid IDN?
>>>>>>      As the dot is used as the label separator, there are three labels:
>>>>>> "example", "", and "com".  Per RFC 3490, "" is not a valid label. Hence,
>>>>>> "example..com" is not a valid IDN.
>>>>>>
>>>>>>      We need to address the issue in IDN.
>>>>>>
>>>>>> 2. Is "xyz." a valid IDN?
>>>>>>      It's a gray area, I think. We can treat the trailing "." as the root
>>>>>> label or as a label separator.
>>>>>>      If the trailing "." is treated as a label separator, "xyz." is invalid
>>>>>> per RFC 3490.
>>>>>>      If the trailing "." is treated as the root label, what is the expected
>>>>>> return value of IDN.toASCII("xyz.")?  I think the return value can be
>>>>>> either "xyz." or "xyz".  The current implementation returns "xyz".
>>>>>>
>>>>>>      We may not need to update the implementation if the trailing "." is
>>>>>> treated as the root label.
>>>>>>
>>>>>> 3. Is "." a valid IDN?
>>>>>>      It's a gray area again, I think.
>>>>>>      As above, if the trailing "." is treated as the root label, I think the
>>>>>> return value can be either "." or "".  The current implementation throws
>>>>>> a StringIndexOutOfBoundsException.
>>>>>>
>>>>>>      However, what would an empty domain name ("") really mean?  I would
>>>>>> prefer to return "." for "." instead.
>>>>>>
>>>>>>      We need to address the issue in IDN.
>>>>>>
>>>>>>
>>>>>> Here is the proposed solution. IDN.toASCII() returns:
>>>>>> 1. "." for ".";
>>>>>> 2. "xyz" for "xyz.";
>>>>>> 3. IAE for "example..com".
>>>>>>
>>>>>> Does it make sense?
>>>>>>
>>>>>> Thanks,
>>>>>> Xuelei
>>>>>>
>>>>>>
>>>>>> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>>>>>>> I don't really understand the reason for the restriction in SNIHostName.
>>>>>>> But I guess that is where it should be enforced if it is required.
>>>>>>>
>>>>>>> Michael.
>>>>>>>
>>>>>>> On 06/08/13 17:43, Dmitry Samersoff wrote:
>>>>>>>> Xuelei,
>>>>>>>>
>>>>>>>> . (dot) is a perfectly valid domain name and it means the root domain, so
>>>>>>>> com. is a valid domain name as well.
>>>>>>>>
>>>>>>>> It seems to me that, in the context of the methods your change touches,
>>>>>>>> we should ignore trailing dots rather than throw an exception.
>>>>>>>>
>>>>>>>> -Dmitry
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Please review the bug fix to tighten the illegal input checking in IDN.
>>>>>>>>>
>>>>>>>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>>>>>>>
>>>>>>>>> Here are two test cases, which are expected to get IAE.
>>>>>>>>>
>>>>>>>>> Case 1:
>>>>>>>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
>>>>>>>>>            at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>>>>>>>            at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>>>>>>>            at java.net.IDN.toASCII(IDN.java:118)
>>>>>>>>>
>>>>>>>>> Case 2:
>>>>>>>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Xuelei
>>>>>>>>>
>>>
>>
> 

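For reference, the behavior described for webrev.01 in the quoted thread above (a trailing dot is accepted, while "." and "example..com" are rejected) can be sketched without calling java.net.IDN. This is only an illustration under stated assumptions: the class and method names are hypothetical, only the ASCII dot is treated as a separator, and the 1 to 63 length check from RFC 3490 section 4.1 step 8 is applied to the raw label rather than to the ToASCII output.

```java
import java.util.ArrayList;
import java.util.List;

public class LabelSplitSketch {
    // Splits a domain name into labels. A single trailing dot is accepted;
    // empty labels (as in "." or "example..com") are rejected with an IAE.
    static List<String> splitLabels(String name) {
        List<String> labels = new ArrayList<>();
        int p = 0;
        while (p < name.length()) {
            int q = name.indexOf('.', p);
            if (q < 0) {
                q = name.length();      // no more dots: last label
            }
            String label = name.substring(p, q);
            // Simplified stand-in for step 8: 1 to 63 code points per label.
            int n = label.codePointCount(0, label.length());
            if (n < 1 || n > 63) {
                throw new IllegalArgumentException(
                        "Illegal label length " + n + " in \"" + name + "\"");
            }
            labels.add(label);
            p = q + 1;                  // skip the separator
        }
        if (labels.isEmpty()) {
            throw new IllegalArgumentException("Empty domain name");
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(splitLabels("www.oracle.com."));
        for (String bad : new String[] {".", "example..com"}) {
            try {
                splitLabels(bad);
            } catch (IllegalArgumentException e) {
                System.out.println("IAE: " + e.getMessage());
            }
        }
    }
}
```

Whether the trailing dot is kept in the returned string is a separate decision; the sketch only shows which inputs are accepted.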

-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the source code.


