Code review request, 8020842 IDN do not throw IAE when hostname ends with a trailing dot

Michael McMahon michael.x.mcmahon at oracle.com
Wed Aug 7 07:05:40 PDT 2013


Resolvers seem to accept queries using trailing dots.

e.g. nslookup www.oracle.com.

or InetAddress.getByName("www.oracle.com.");

The part of RFC3490 quoted below seems to me to be saying
that the empty label implied by the trailing dot is not regarded
as a label so that you don't end up calling toAscii() or toUnicode()
with an empty string. I don't think it's saying the trailing dot can't 
be there.
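
To make that reading concrete, here is a rough sketch. It assumes only
the ASCII full stop is used as a separator, and RootLabelSketch and
labels() are hypothetical names, not anything in java.net.IDN. One
trailing dot is taken as the explicit root label and is never handed on
as a label:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of java.net.IDN.
class RootLabelSketch {

    // Splits a domain name into labels. A single trailing dot is read as
    // the explicit zero-length root label and is not returned as a label.
    static List<String> labels(String name) {
        String body = name.endsWith(".")
                ? name.substring(0, name.length() - 1)
                : name;
        List<String> out = new ArrayList<>();
        if (body.isEmpty()) {
            return out;   // "." or "": only the root, no labels at all
        }
        // A limit of -1 keeps empty labels, e.g. the middle one in "a..b".
        for (String label : body.split("\\.", -1)) {
            out.add(label);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(labels("www.oracle.com."));
        System.out.println(labels("www.oracle.com"));
        System.out.println(labels("."));
    }
}
```

With this reading, "www.oracle.com." and "www.oracle.com" give the same
three labels, so a per-label toAscii() would never see an empty string
just because of the trailing dot.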

Michael

On 07/08/13 13:44, Xuelei Fan wrote:
> On 8/7/2013 12:06 AM, Matthew Hall wrote:
>> Trailing dots are allowed in plain DNS (thus almost surely in IDN),
>> and the single dot represents the root zone. So you have to be
>> careful making this sort of change to check the DNS RFCs first.
> That's the first question we need to answer: does IDN allow trailing
> dots ("com."), the zero-length root label ("."), and zero-length labels
> ("", for example "example..com")?
>
> Per the specification of IDN.toASCII():
> =======================================
> "ToASCII operation can fail. ToASCII fails if any step of it fails. If
> ToASCII operation fails, an IllegalArgumentException will be thrown. In
> this case, the input string should not be used in an internationalized
> domain name.
>
> A label is an individual part of a domain name. The original ToASCII
> operation, as defined in RFC 3490, only operates on a single label. This
> method can handle both label and entire domain name, by assuming that
> labels in a domain name are always separated by dots. ...
>
> Throws IllegalArgumentException - if the input string doesn't conform to
> RFC 3490 specification"
>
> Per the specification of RFC 3490:
> ==================================
> [section 2]
> "A label is an individual part of a domain name.  Labels are usually
>   shown separated by dots; for example, the domain name
>   "www.example.com" is composed of three labels: "www", "example", and
>   "com".  (The zero-length root label described in [STD13], which can
>   be explicit as in "www.example.com." or implicit as in
>   "www.example.com", is not considered a label in this specification.)"
>
> "An "internationalized label" is a label to which the ToASCII
>   operation (see section 4) can be applied without failing (with the
>   UseSTD3ASCIIRules flag unset).  ...
>   Although most Unicode characters can appear in
>   internationalized labels, ToASCII will fail for some input strings,
>   and such strings are not valid internationalized labels."
>
> "An "internationalized domain name" (IDN) is a domain name in which
>   every label is an internationalized label."
>
> [Section 4.1]
> "ToASCII consists of the following steps:
>
>   ...
>
>   8. Verify that the number of code points is in the range 1 to 63
>        inclusive."
>
>
> Here are the questions:
> 1. Is "example..com" a valid IDN?
>     As dots are used as label separators, there are three labels:
> "example", "", and "com".  Per RFC 3490 (ToASCII step 8), "" is not a
> valid label. Hence, "example..com" is not a valid IDN.
>
>     We need to address the issue in IDN.
>
> 2. Is "xyz." a valid IDN?
>     It's a gray area, I think. We can treat the trailing "." as the root
> label, or as a label separator.
>     If the trailing "." is treated as a label separator, "xyz." is invalid
> per RFC 3490.
>     If the trailing "." is treated as the root label, what's the expected
> return value of IDN.toASCII("xyz.")?  I think the return value can be
> either "xyz." or "xyz".  The current implementation returns "xyz".
>
>     We may not need to update the implementation if the trailing "." is
> treated as the root label.
>
> 3. Is "." a valid IDN?
>     It's a gray area again, I think.
>     As above, if the trailing "." is treated as the root label, I think the
> return value can be either "." or "".  The current implementation throws
> a StringIndexOutOfBoundsException.
>
>     However, what would an empty domain name ("") really mean?  I would
> prefer to return "." for "." instead.
>
>     We need to address the issue in IDN.
>
>
> Here is the proposed solution. IDN.toASCII() returns:
> 1. "." for ".";
> 2. "xyz" for "xyz.";
> 3. IAE for "example..com".
>
> Does it make sense?
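
The three outcomes listed above could be sketched roughly as follows.
This is only an illustration under assumptions: ProposedToAsciiSketch
and its toAscii() are hypothetical names, only the ASCII full stop is
treated as a separator, and this is not the code in the webrev. It
calls the existing IDN.toASCII(String) on one non-empty label at a time:

```java
import java.net.IDN;

// Hypothetical wrapper illustrating the proposal; not the actual patch.
class ProposedToAsciiSketch {

    static String toAscii(String name) {
        if (name.equals(".")) {
            return ".";   // case 1: the root domain maps to itself
        }
        // Case 2: drop one trailing dot (the explicit root label).
        String body = name.endsWith(".")
                ? name.substring(0, name.length() - 1)
                : name;
        StringBuilder out = new StringBuilder();
        // A limit of -1 keeps empty labels so they can be rejected below.
        for (String label : body.split("\\.", -1)) {
            if (label.isEmpty()) {
                // Case 3: an empty label is never valid.
                throw new IllegalArgumentException(
                        "empty label in \"" + name + "\"");
            }
            if (out.length() > 0) {
                out.append('.');
            }
            out.append(IDN.toASCII(label));   // per-label ToASCII
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAscii("."));
        System.out.println(toAscii("xyz."));
    }
}
```

Note that in this sketch the empty string "" is also rejected with IAE,
which is a choice of the sketch and not something the list above states.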
>
> Thanks,
> Xuelei
>
>
> On 8/7/2013 1:35 AM, Michael McMahon wrote:
>> I don't really understand the reason for the restriction in SNIHostName.
>> But I guess that is where it should be enforced if it is required.
>>
>> Michael.
>>
>> On 06/08/13 17:43, Dmitry Samersoff wrote:
>>> Xuelei,
>>>
>>> . (dot) is a perfectly valid domain name and it means the root domain,
>>> so com. is a valid domain name as well.
>>>
>>> It seems to me that, in the context of the methods you change, we
>>> should ignore trailing dots rather than throw an exception.
>>>
>>> -Dmitry
>>>
>>>
>>>
>>> On 2013-08-06 15:44, Xuelei Fan wrote:
>>>> Hi,
>>>>
>>>> Please review the bug fix to tighten the illegal input checking in IDN.
>>>>
>>>> webrev: http://cr.openjdk.java.net./~xuelei/8020842/webrev.00/
>>>>
>>>> Here are two test cases, which are expected to get IAE.
>>>>
>>>> Case 1:
>>>> String host = IDN.toASCII(".", IDN.USE_STD3_ASCII_RULES);
>>>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
>>>> String index out of range: 0
>>>>           at java.lang.StringBuffer.charAt(StringBuffer.java:204)
>>>>           at java.net.IDN.toASCIIInternal(IDN.java:279)
>>>>           at java.net.IDN.toASCII(IDN.java:118)
>>>>
>>>> Case 2:
>>>> String host = IDN.toASCII("com.", IDN.USE_STD3_ASCII_RULES);
>>>>
>>>> Thanks,
>>>> Xuelei
>>>>



