[7u8] [Re: RFR [7u6]: 7166896: DocumentBuilder.parse(String uri) is not IPv6 enabled. It throws MalformedURLException

Joe Wang huizhe.wang at oracle.com
Tue Jul 10 18:50:07 UTC 2012


On 7/9/2012 10:59 PM, Paul Sandoz wrote:
> Hi Joe,
>
> What happens when someone logs a bug for system IDs containing IPv6 addresses and non-percent encoded international characters?

An exception would be expected, just as if Xerces were used.

>
>
> On Jul 10, 2012, at 3:42 AM, Joe Wang wrote:
>> Hi Paul,
>>
>> I'm back from vacation.
>>
>> You're right. But such an error is also expected. The original design never tried to outdo java.net.URL. If a system ID is rejected by java.net.URL, the result should be an exception.
>>
>> The patch that supplied the extra encoding was provided to both Sun and Apache, and applied to Sun sources. However, it never went into the Apache code base (refer to https://issues.apache.org/jira/browse/XERCESJ-1156).  I thought of removing the patch, bringing our source in sync with that of Apache. But then I feared that we might get a regression since the patch has been in the source for so many years.
>>
>> Thus this ugly solution (removing the patch would be prettier): leave the old change as is, but use java.net.URL in all other cases.
>>
> java.net.URL is being used in all cases:

Except that the input is an encoded URL when escapeNonUSAscii is used.

>
>         if (reader == null) {
>             stream = xmlInputSource.getByteStream();
>             if (stream == null) {
>                 URL location = new URL(escapeNonUSAscii(expandedSystemId));
>                 URLConnection connect = location.openConnection();
>                 if (!(connect instanceof HttpURLConnection)) {
>                     stream = connect.getInputStream();
>                 }
>
> If this is really about supporting non-percent encoded international characters in the system ID, then you can make a simple fix to support IPv6-based URLs in general: do not percent-encode *any* ASCII characters.
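
For reference, a minimal sketch of what that suggestion could look like. The class NonAsciiEscaper and the method escapeNonAscii are hypothetical names for illustration only; this is not the JDK's escapeNonUSAscii, whose exact escaping rules are not shown in this thread. The sketch assumes non-ASCII characters should be percent-encoded as their UTF-8 bytes and every ASCII character passed through untouched, so the brackets and colons of an IPv6 literal survive:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the JDK or Xerces.
final class NonAsciiEscaper {

    // Percent-encodes only non-ASCII code points (as UTF-8 bytes).
    // Every ASCII character, including '[', ']' and ':', is copied unchanged.
    static String escapeNonAscii(String systemId) {
        StringBuilder out = new StringBuilder(systemId.length());
        for (int i = 0; i < systemId.length(); ) {
            int cp = systemId.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < 0x80) {
                out.appendCodePoint(cp);
            } else {
                byte[] utf8 = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
        }
        return out.toString();
    }
}
```

With this approach a system ID such as http://[::1]:8080/a.xml comes back unchanged, so the string handed to java.net.URL still contains a bracketed IPv6 literal. Whether leaving reserved ASCII characters unescaped is acceptable in all cases is exactly the open question here.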

When encoding a URL, aren't reserved characters supposed to be encoded
as well?

Joe

>
> Paul.
>
>
>> By the way, we can only consider this one for 7u8 now.
>>
>> Thanks,
>> Joe
>>


