RFR: 8305763 : Parsing a URI with an underscore goes through a silent exception, negatively impacting performance

Daniel Fuchs dfuchs at openjdk.org
Wed Apr 12 10:34:34 UTC 2023


On Tue, 11 Apr 2023 18:00:05 GMT, Dhamoder Nalla <duke at openjdk.org> wrote:

> Issue 8305763: Using underscores in the host name of a URI triggers a silent exception in the Java standard library, which consumes 5% of the CPU.
> 
> Exception:
> java.net.URISyntaxException: Illegal character in hostname at index N: xyz1_abcd.com
>     at java.base/java.net.URI$Parser.fail(URI.java:2943)
>     at java.base/java.net.URI$Parser.parseHostname(URI.java:3487)
>     at java.base/java.net.URI$Parser.parseServer(URI.java:3329)
> 
> This exception is silent and produces no messages; apart from the ODP profiler, there is no other evidence that it’s happening (the stack trace above was printed after changes to the Java library). The cause is how URI creation is implemented in the java.net.URI class: there are two paths for creating a valid URI, and one of them goes through an exception.
> 
> We can see that if parseServer fails, the authority still gets assigned and the method does not throw an exception. In other words, failing to parse the server is acceptable and the exception is silenced. In our case, server parsing fails because we find an illegal character, as only alphanumeric and dash characters are allowed.

From a quick look at the proposed change, I got the feeling that it might not be appropriate: I suspect it will let `host` be assigned the reg_name.
We want to preserve the long-standing behavior that:

jshell> new URI("http://foo_bar:8080/").getHost()
$1 ==> null

Is this still the case after your proposed changes?
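
For reference, a minimal sketch of how that behavior could be checked outside jshell. The class name `UnderscoreHostCheck` and the helper `isRegistryBased` are made up for illustration and are not part of the PR; the sketch only uses the public `java.net.URI` API and assumes the current behavior, where an authority that fails server-based parsing is kept as a registry-based authority.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UnderscoreHostCheck {

    // Hypothetical helper: returns true when the authority cannot be
    // parsed as server-based (user-info, host, port), meaning the URI
    // only carries a registry-based authority.
    static boolean isRegistryBased(URI uri) {
        try {
            // Re-attempts server-based parsing and throws if it fails.
            uri.parseServerAuthority();
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) throws URISyntaxException {
        URI uri = new URI("http://foo_bar:8080/");

        // The underscore makes server-based parsing fail, so host and
        // port are left undefined while the authority is still set.
        System.out.println("host           = " + uri.getHost());
        System.out.println("port           = " + uri.getPort());
        System.out.println("authority      = " + uri.getAuthority());
        System.out.println("registry-based = " + isRegistryBased(uri));
    }
}
```

If `getHost()` started returning `foo_bar` here after the change, that would be the behavioral change I am concerned about.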

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13430#issuecomment-1505042769

