RFR: 8248001: javadoc generates invalid HTML pages whose ftp:// links are broken

Hannes Wallnöfer hannesw at openjdk.java.net
Tue Aug 24 10:03:23 UTC 2021


On Fri, 20 Aug 2021 11:41:02 GMT, Daniel Fuchs <dfuchs at openjdk.org> wrote:

>> I assume the link is in an HTML document and goes into an HTML document. If you wanted to use java.net.URI, then depending on where `text` comes from and where it goes, you might first need to decode it using URLDecoder, and then re-encode it before writing it out. That's a lot of operations where things could go wrong, especially if the link contains a query string.
>
> That said, a stricter regexp (unless I'm mistaken) could be: `^[a-zA-Z][a-zA-Z0-9+.-]*:.+$`
> [ from RFC 2396: scheme = alpha *( alpha | digit | "+" | "-" | "." ) ]

I would normally opt for a generic regexp-based solution such as the one proposed by @dfuch, but there is a security aspect to this as well (e.g. script invocation), so I'd go with the more conservative approach here and just add the `ftp:` protocol to the list.
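
To illustrate the trade-off, here is a minimal sketch comparing the two checks. The class name, method names, and the contents of the allowlist are assumptions made for this example; they are not taken from the javadoc sources. The generic pattern follows the RFC 2396 scheme grammar quoted above.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not the javadoc implementation.
final class LinkSchemeCheck {

    // Generic scheme check per RFC 2396. The '-' is placed last in the
    // character class so it is read literally rather than as a range.
    private static final Pattern GENERIC_SCHEME =
            Pattern.compile("^[a-zA-Z][a-zA-Z0-9+.-]*:.+$");

    // Assumed allowlist of link prefixes; the real list in javadoc may differ.
    private static final Set<String> ALLOWED_PREFIXES =
            Set.of("http:", "https:", "file:", "mailto:", "ftp:");

    // True if the text starts with anything that is syntactically a URI scheme.
    static boolean matchesGenericScheme(String text) {
        return GENERIC_SCHEME.matcher(text).matches();
    }

    // True only if the text starts with one of the explicitly allowed prefixes.
    static boolean hasAllowedScheme(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        for (String prefix : ALLOWED_PREFIXES) {
            if (lower.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] links = {
            "ftp://example.org/pub/file.txt",
            "javascript:alert(1)",
            "../api/index.html"
        };
        for (String link : links) {
            System.out.printf("%-32s generic=%b allowlist=%b%n",
                    link, matchesGenericScheme(link), hasAllowedScheme(link));
        }
    }
}
```

The generic pattern accepts `javascript:alert(1)` because it is a syntactically valid scheme, while the allowlist rejects it; that difference is the security concern behind the conservative choice.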

-------------

PR: https://git.openjdk.java.net/jdk/pull/5198

