Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

Dmitry Samersoff dmitry.samersoff at oracle.com
Wed Aug 7 06:10:23 PDT 2013


Xuelei,

The root label is an empty label [1] and the dot is a label separator, so
in printed form domain names are dot-terminated.

Please also see my comments inline below.

[1] RFC 1034 (rfc1034.txt):

Internally, programs that manipulate domain names should represent them
as sequences of labels, where each label is a length octet followed by
an octet string.  Because all domain names end at the root, *which has a
null string for a label*, these internal representations can use a
length byte of zero to terminate a domain name.
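
As a rough illustration of the representation RFC 1034 describes above,
here is a minimal Java sketch that turns a printed name into length-prefixed
labels terminated by the zero-length root label. The class and method names
are hypothetical (they are not part of java.net.IDN or the webrev), and the
sketch assumes the input is already ASCII.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of java.net.IDN.
final class WireNameSketch {
    // Encodes a printed, ASCII-only domain name as length-prefixed labels
    // followed by a zero length octet for the root label.
    static byte[] encode(String name) {
        // An explicit trailing dot only spells out the root label, which is
        // written unconditionally at the end, so drop one if present.
        String body = name.endsWith(".")
                ? name.substring(0, name.length() - 1)
                : name;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!body.isEmpty()) {
            // Limit -1 keeps empty strings so that "a..b" is detected.
            for (String label : body.split("\\.", -1)) {
                byte[] octets = label.getBytes(StandardCharsets.US_ASCII);
                if (octets.length < 1 || octets.length > 63) {
                    throw new IllegalArgumentException(
                            "label length out of range: \"" + label + "\"");
                }
                out.write(octets.length);            // length octet
                out.write(octets, 0, octets.length); // octet string
            }
        }
        out.write(0); // root label: length byte of zero
        return out.toByteArray();
    }
}
```

In this sketch "com" and "com." produce the same bytes, because the root
label is implicit in the first form and explicit in the second.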


On 2013-08-07 16:44, Xuelei Fan wrote:
> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>> and the single dot represents the root zone. So you have to be
>> careful making this sort of change to check the DNS RFCs first.
> 
> That's the first question we need to answer: does IDN allow trailing
> dots ("com."), a zero-length root label ("."), and zero-length labels
> ("", for example "example..com")?
> 
> Per the specification of IDN.toASCII():
> =======================================
> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
> ToASCII operation fails, an IllegalArgumentException will be thrown. In
> this case, the input string should not be used in an internationalized
> domain name.
> 
> A label is an individual part of a domain name. The original ToASCII
> operation, as defined in RFC 3490, only operates on a single label. This
> method can handle both label and entire domain name, by assuming that
> labels in a domain name are always separated by dots. ...
> 
> Throws IllegalArgumentException - if the input string doesn't conform to
> RFC 3490 specification"
> 
> Per the specification of RFC 3490:
> ==================================
> [section 2]
> "A label is an individual part of a domain name.  Labels are usually
>  shown separated by dots; for example, the domain name
>  "www.example.com" is composed of three labels: "www", "example", and
>  "com".  (The zero-length root label described in [STD13], which can
>  be explicit as in "www.example.com." or implicit as in
>  "www.example.com", is not considered a label in this specification.)"
> 
> "An "internationalized label" is a label to which the ToASCII
>  operation (see section 4) can be applied without failing (with the
>  UseSTD3ASCIIRules flag unset).  ...
>  Although most Unicode characters can appear in
>  internationalized labels, ToASCII will fail for some input strings,
>  and such strings are not valid internationalized labels."
> 
> "An "internationalized domain name" (IDN) is a domain name in which
>  every label is an internationalized label."
> 
> [Section 4.1]
> "ToASCII consists of the following steps:
> 
>  ...
> 
>  8. Verify that the number of code points is in the range 1 to 63
>       inclusive."
> 
> 
> Here are the questions:
> 1. Is "example..com" a valid IDN?
>    As the dot is used as the label separator, there are three labels:
> "example", "", "com".  Per RFC 3490, "" is not a valid label. Hence,
> "example..com" is not a valid IDN.
> 
>    We need to address the issue in IDN.

The root label can't appear in the middle of a domain name, so
example..com is an invalid domain name and an appropriate exception has
to be thrown.

> 
> 2. Is "xyz." a valid IDN?
>    It's a gray area, I think. We can treat the trailing "." as the root
> label, or as a label separator.
>    If the trailing "." is treated as a label separator, "xyz." is invalid
> per RFC 3490.
>    If the trailing "." is treated as the root label, what's the expected
> return value of IDN.toASCII("xyz.")?  I think the return value can be
> either "xyz." or "xyz".  The current implementation returns "xyz".
> 
>    We may not need to update the implementation if the trailing "." is
> treated as the root label.

An empty label at the end of a domain name is valid per RFC 1034 and
denotes the root label. So we should process this name and return all
non-empty labels.
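
A minimal sketch of that behaviour, assuming a hypothetical helper (the
class and method names are mine, not from the webrev): drop one trailing
root label, then return the remaining labels and reject any that are empty.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of java.net.IDN.
final class TrailingRootSketch {
    // Returns the non-empty labels of a name, ignoring one trailing dot
    // (the explicit root label). Any other empty label is rejected.
    static List<String> nonRootLabels(String name) {
        String body = name.endsWith(".")
                ? name.substring(0, name.length() - 1)
                : name;
        List<String> labels = new ArrayList<>();
        // Limit -1 keeps empty strings so they can be detected below.
        for (String label : body.split("\\.", -1)) {
            if (label.isEmpty()) {
                throw new IllegalArgumentException(
                        "empty label in \"" + name + "\"");
            }
            labels.add(label);
        }
        return labels;
    }
}
```

Joining the result with "." gives "xyz" for "xyz.", which matches the
return value of the current implementation mentioned above.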

> 3. Is "." a valid IDN?
>    It's a gray area again, I think.
>    As above, if the trailing "." is treated as the root label, I think the
> return value can be either "." or "".  The current implementation throws
> a StringIndexOutOfBoundsException.
> 
>    However, what does an empty domain name ("") really mean?  I would
> prefer to return "." for "." instead.
> 
>    We need to address the issue in IDN.

As the dot is a label separator and the root (empty) label can't appear
in the middle of a domain name, . (dot) is not a valid name. This case is
similar to case (1): we should throw an appropriate exception.
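
The rule I am applying in cases (1) and (3) can be sketched as "every dot
must be preceded by a non-empty label". The helper below is hypothetical and
only illustrates that rule. It treats only U+002E as a separator, and
rejecting the empty string is an assumption of the sketch rather than
something settled in this thread.

```java
// Hypothetical helper for illustration only; not part of java.net.IDN.
final class EmptyLabelCheckSketch {
    // Returns true if every '.' in the name terminates a non-empty label.
    // A single trailing dot is therefore accepted, while a leading dot,
    // two adjacent dots, or a lone "." are not.
    static boolean everyDotFollowsALabel(String name) {
        int labelLength = 0; // characters seen since the last dot
        for (int i = 0; i < name.length(); i++) {
            if (name.charAt(i) == '.') {
                if (labelLength == 0) {
                    return false; // a dot with no label before it
                }
                labelLength = 0;
            } else {
                labelLength++;
            }
        }
        // Assumption: the empty string is not accepted as a name.
        return !name.isEmpty();
    }
}
```

Under this rule "xyz." and "www.example.com" pass, while "." and
"example..com" fail for the same reason.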

-Dmitry

> 
> Here comes the solution, the IDN.toASCII() returns:
> 1. "." for ".";
> 2. "xyz" for "xyz.";
> 3. IAE for "example..com".
> 
> Does it make sense?
> 
> Thanks,
> Xuelei
> 
> 
> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>> I don't really understand the reason for the restriction in SNIHostName,
>> but I guess that is where it should be enforced if it is required.
>>
>> Michael.
>>
>> On 06/08/13 17:43, Dmitry Samersoff wrote:
>>> Xuelei,
>>>
>>> . (dot) is perfectly valid domain name and it means root domain so com.
>>> is valid domain name as well.
>>>
>>> It seems to me that, in the context of the methods you change, we should
>>> ignore trailing dots rather than throw an exception.
>>>
>>> -Dmitry
>>>
>>>
>>>
>>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>>> Hi,
>>>>
>>>> Please review the bug fix to tighten the illegal input checking in IDN.
>>>>
>>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>>
>>>> Here are two test cases, which are expected to get IAE.
>>>>
>>>> Case 1:
>>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
>>>> String index out of range: 0
>>>>          at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>>          at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>>          at java.net.IDN.toASCII(IDN.java:118)
>>>>
>>>> Case 2:
>>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>>
>>>> Thanks,
>>>> Xuelei
>>>>
>>>
>>
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.


