JDK-8019345, RFC3986, RFC2396 and java.net.URI

Peter Firmstone peter.firmstone at zeus.net.au
Sun Nov 10 12:04:47 UTC 2024


We've been using an RFC3986 URI implementation for over a decade.  There 
were formatting issues we had to work around, so we provided static 
methods to address them.  Strict normalization also brings significant 
performance benefits where URIs are used for identity.

Java doesn't implement RFC2396 strictly: it has an expanded character 
set that doesn't require escaping, which can result in more than one 
normalized form.   My understanding is that it's these kinds of corner 
cases around character escaping that prevented Java's URI 
implementation from being upgraded to RFC3986.
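
As a rough illustration of the "more than one form" point, using only 
java.net.URI as shipped in the JDK (the class name and the example.com 
addresses below are made up for the example):

```java
import java.net.URI;

public class OtherCharsDemo {
    // U+00E9 is one of the "other" characters java.net.URI accepts unescaped.
    static final URI RAW = URI.create("http://example.com/caf\u00e9");
    // The same resource with the character written as UTF-8 escaped octets.
    static final URI ESCAPED = URI.create("http://example.com/caf%C3%A9");

    // RFC 3986 treats an escaped unreserved character as equivalent to the
    // literal one; java.net.URI compares the raw strings instead.
    static final URI TILDE_RAW = URI.create("http://example.com/~user");
    static final URI TILDE_ESCAPED = URI.create("http://example.com/%7Euser");

    public static void main(String[] args) {
        System.out.println(RAW.equals(ESCAPED));             // false
        System.out.println(RAW.toASCIIString());             // escaped form
        System.out.println(TILDE_RAW.equals(TILDE_ESCAPED)); // false
    }
}
```

Both pairs name the same resource, yet neither pair is equal, which is 
the kind of false negative strict normalisation is meant to remove.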

We use RFC3986 for identity and URL for connections.  The JDK currently 
still depends on URL for identity, which generally incurs the cost of a 
network DNS lookup (at least once per URL).   Using Uri for identity 
allows for server replication, for example, whereas a URL resolves to 
an IP address, and a number of replicated servers may resolve from the 
same URI to a range of addresses.   In reality identity should be 
determined by authentication; using DNS to determine identity is 
costly when a suitable RFC3986 URI normalization implementation can 
infer it without incurring network calls.
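
A minimal sketch of identity without name resolution follows.  It uses 
java.net.URI as a stand-in (it normalises less strictly than the Uri 
class described further down), and the class, method and jar names are 
hypothetical.  java.net.URL is deliberately avoided as a key, since its 
equals and hashCode are documented as potentially blocking on host 
resolution.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

public class IdentityDemo {
    /** Collects distinct locations keyed by URI; nothing is resolved. */
    static Set<URI> identities(String... locations) {
        Set<URI> seen = new HashSet<>();
        for (String s : locations) {
            // normalize() removes "." and ".." path segments, and
            // URI.equals/hashCode ignore case in scheme and host.
            // All of this is string processing only.
            seen.add(URI.create(s).normalize());
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<URI> ids = identities(
                "http://example.com/lib/./a.jar",
                "HTTP://EXAMPLE.COM/lib/a.jar",
                "http://example.com/lib/b.jar");
        System.out.println(ids.size()); // 2: the first two collapse
    }
}
```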

Perhaps it might be an option to use a provider mechanism, allowing an 
RFC version to be selected?

Our implementation also has a utility method to return a URL instance 
for connections.

Javadoc from our RFC3986 Uri implementation (AL2.0):

/**
  * This class represents an immutable instance of a URI as defined by
  * RFC 3986.
  * <p>
  * This class replaces java.net.URI functionality.
  * <p>
  * Unlike java.net.URI this class is not Serializable and hashCode and
  * equality are governed by strict RFC3986 normalisation. In addition
  * "other" characters allowed in java.net.URI as specified by javadoc, not
  * specifically allowed by RFC3986 are illegal and must be escaped.  This
  * strict adherence is essential to eliminate false negative or positive
  * matches.
  * <p>
  * In addition to RFC3986 normalisation, on OS platforms with a \ file
  * separator the path is converted to UPPER CASE for comparison for the
  * file: scheme, during equals and hashCode calls.
  * <p>
  * IPv6 and IPvFuture host addresses must be enclosed in square brackets
  * as per RFC3986.  A zone delimiter %, if present, must be represented in
  * escaped %25 form as per RFC6874.
  * <p>
  * In addition to RFC3986 normalization, IPv6 host addresses will be
  * normalized to comply with RFC 5952 A Recommendation for IPv6 Address
  * Text Representation.
  * This is to ensure consistent equality between identical IPv6 addresses.
  *
  * @since 3.0.0
  */
public final class Uri implements Comparable<Uri> {
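
The RFC 5952 step mentioned in the Javadoc above can be sketched 
roughly as follows.  This is not the code from the Uri class; the 
class and method names are hypothetical, and zone ids and IPv4-mapped 
addresses are left out to keep it short.  It lower-cases the hex, 
drops leading zeros and compresses the longest run of zero groups.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class Ipv6Text {
    /** Returns an RFC 5952 style text form of an unbracketed IPv6 literal. */
    static String canonical(String literal) {
        if (literal.indexOf(':') < 0) {
            // Guard so that a host name is never handed to the resolver.
            throw new IllegalArgumentException("not an IPv6 literal: " + literal);
        }
        byte[] b;
        try {
            // For a literal address this only parses the text.
            b = InetAddress.getByName(literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
        if (b.length != 16) {
            throw new IllegalArgumentException("unsupported form: " + literal);
        }
        int[] g = new int[8];
        for (int i = 0; i < 8; i++) {
            g[i] = ((b[2 * i] & 0xff) << 8) | (b[2 * i + 1] & 0xff);
        }
        // Longest run of zero groups, at least two long, first run wins ties.
        int bestStart = -1;
        int bestLen = 1;
        for (int i = 0; i < 8; ) {
            if (g[i] != 0) { i++; continue; }
            int j = i;
            while (j < 8 && g[j] == 0) j++;
            if (j - i > bestLen) { bestStart = i; bestLen = j - i; }
            i = j;
        }
        StringBuilder sb = new StringBuilder(39);
        for (int i = 0; i < 8; ) {
            if (i == bestStart) {
                sb.append("::");
                i += bestLen;
                continue;
            }
            if (sb.length() > 0 && sb.charAt(sb.length() - 1) != ':') {
                sb.append(':');
            }
            sb.append(Integer.toHexString(g[i])); // lower case, no leading zeros
            i++;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(canonical("2001:0DB8:0:0:0:0:0:1")); // 2001:db8::1
    }
}
```

With a single text form per address, two hosts can be compared with 
plain String equality.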

<SNIP>  Static factory methods for various cases:

     /**
      * Parses the given argument {@code rfc3986compliantURI} and creates an
      * appropriate URI instance.
      *
      * The parameter string is checked for compliance, an
      * IllegalArgumentException is thrown if the string is non compliant.
      *
      * @param rfc3986compliantURI
      *            the string which has to be parsed to create the URI
      *            instance.
      * @return the created instance representing the given URI.
      */
     public static Uri create(String rfc3986compliantURI) {
         try {
             return new Uri(rfc3986compliantURI);
         } catch (URISyntaxException e) {
             // Keep the cause so the original parse failure is not lost.
             throw new IllegalArgumentException(e.getMessage(), e);
         }
     }

     /**
      * The parameter string doesn't contain any existing escape sequences,
      * any escape character % found is encoded as %25. Illegal characters
      * are escaped if possible.
      *
      * The Uri is normalised according to RFC3986.
      *
      * @param unescapedString URI in un-escaped string form
      * @return an RFC3986 compliant Uri.
      * @throws java.net.URISyntaxException if string cannot be escaped.
      */
     public static Uri escapeAndCreate(String unescapedString)
             throws URISyntaxException {
         return new Uri(quoteComponent(unescapedString, allLegalUnescaped));
     }

     /**
      * The parameter string may already contain escaped sequences, any
      * illegal characters are escaped and any that shouldn't be escaped are
      * un-escaped.
      *
      * The escape character % is not re-encoded.
      * @param nonCompliantEscapedString URI in string form.
      * @return an RFC3986 compliant Uri.
      * @throws java.net.URISyntaxException if string cannot be escaped.
      */
     public static Uri parseAndCreate(String nonCompliantEscapedString)
             throws URISyntaxException {
         return new Uri(quoteComponent(nonCompliantEscapedString, allLegal));
     }
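
The difference between the two escaping factories above comes down to 
how an existing % is treated.  The sketch below is a simplified 
stand-in, not the quoteComponent method referenced above: the quote 
helper and its legal character set are assumptions, and it does not 
un-escape characters that need no escaping.

```java
import java.nio.charset.StandardCharsets;

public class EscapeModes {
    // Left untouched in this sketch: RFC 3986 unreserved and reserved chars.
    private static final String LEGAL =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
            + "-._~" + ":/?#[]@" + "!$&'()*+,;=";

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')
                || (c >= 'A' && c <= 'F');
    }

    /**
     * keepEscapes == false: every % is data and becomes %25.
     * keepEscapes == true:  a well formed %XX triplet is copied through.
     */
    static String quote(String s, boolean keepEscapes) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp == '%' && keepEscapes && i + 2 < s.length()
                    && isHex(s.charAt(i + 1)) && isHex(s.charAt(i + 2))) {
                sb.append(s, i, i + 3); // existing escape, not re-encoded
                i += 3;
                continue;
            }
            if (cp < 128 && LEGAL.indexOf(cp) >= 0) {
                sb.append((char) cp);
            } else {
                // Escape each UTF-8 octet of the character.
                String ch = new String(Character.toChars(cp));
                for (byte b : ch.getBytes(StandardCharsets.UTF_8)) {
                    sb.append('%').append(String.format("%02X", b & 0xff));
                }
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String in = "file:/a b/50%25";
        System.out.println(quote(in, false)); // file:/a%20b/50%2525
        System.out.println(quote(in, true));  // file:/a%20b/50%25
    }
}
```

Applying the first mode to input that is already escaped double-encodes 
it, which is why the caller has to state which kind of string it holds.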

<SNIP>


     /**
      * Fixes windows file URI string by converting back slashes to forward
      * slashes and inserting a forward slash before the drive letter if it
      * is missing.  No normalisation or modification of case is performed.
      * @param uri String representation of URI
      * @return fixed URI String
      */
     public static String fixWindowsURI(String uri) {
         if (uri == null) return null;
         if (File.separatorChar != '\\') return uri;
         if ( uri.startsWith("file:") || uri.startsWith("FILE:")){
             char [] u = uri.toCharArray();
             int l = u.length;
             StringBuilder sb = new StringBuilder(uri.length()+1);
             for (int i=0; i<l; i++){
                 // Ensure we use forward slashes
                 if (u[i] == File.separatorChar) {
                     sb.append('/');
                     continue;
                 }
                 if (i == 5 && uri.startsWith(":", 6 )) {
                     // Windows drive letter without leading slashes doesn't
                     // comply with URI spec, fix it here
                     sb.append("/");
                 }
                 sb.append(u[i]);
             }
             return sb.toString();
         }
         return uri;
     }


<SNIP>


     public static Uri filePathToUri(String path) throws URISyntaxException {
         String forwardSlash = "/";
         if (path == null || path.length() == 0) {
             // codebase is "file:"
             path = "*";
         }
         // Ensure compatibility with URLClassLoader, when directory
         // character is dropped by File.
         boolean directory = false;
         if (path.endsWith(forwardSlash)) directory = true;
         path = new File(path).getAbsolutePath();
         if (directory) {
             if (!(path.endsWith(File.separator))){
                 path = path + File.separator;
             }
         }
         if (File.separatorChar == '\\') {
             path = path.replace(File.separatorChar, '/');
         }
         path = fixWindowsURI("file:" + path);
         return Uri.escapeAndCreate(path); //$NON-NLS-1$
     }

<SNIP>

     /**
      * Converts this URI instance to a URL.
      *
      * @return the created URL representing the same resource as this URI.
      * @throws MalformedURLException
      *             if an error occurs while creating the URL or no protocol
      *             handler could be found.
      */
     public URL toURL() throws MalformedURLException {
         if (!absolute) {
             throw new IllegalArgumentException(
                     Messages.getString("luni.91") + ": " //$NON-NLS-1$//$NON-NLS-2$
                     + toString());
         }
         // Let the Handler parse it.
         if (opaque) return new URL(toString());
         String hst = host;
         StringBuilder sb = new StringBuilder();
         //userinfo will be rare, utilise sb, then clear it.
         if (userinfo != null){
             sb.append(userinfo).append('@').append(hst);
             hst = sb.toString();
             sb.setLength(0); // empty the builder completely before reuse
         }
         // now lets create the file section of the URL.
         sb.append(path);
         if (query != null) sb.append('?').append(query);
         if (fragment != null) sb.append('#').append(fragment);
         String file = sb.toString(); //for code readability
         // deprecated to provide a warning against misuse, not for removal.
         @SuppressWarnings("deprecation")
         URL url = new URL(scheme, hst, port, file, null);
         return url;
     }

-- 
Regards,
  
Peter


