[rfc][icedtea-web] Fix regression in recent PR1204 patch

Andrew Azores aazores at redhat.com
Wed Oct 2 14:18:11 PDT 2013


On 10/02/2013 01:17 PM, Omair Majid wrote:
> Hi Andrew,
>
> On 10/01/2013 05:16 PM, Andrew Azores wrote:
>> On 10/01/2013 12:01 PM, Omair Majid wrote:
>>> On 10/01/2013 11:45 AM, Andrew Azores wrote:
>>>> I just caught a regression in my own patch. The "path" portion of the
>>>> URI was already percent-encoded seeing as it was obtained with
>>>> URL#getPath(). This was then fed back into the constructor of a new URI,
>>>> and the .toURL() result of this URI was returned. This process repeated
>>>> the percent-encoding, so eg spaces would become "%2520" rather than
>>>> simply "%20".
>>> Could you include a unit test? Otherwise looks good to me.
>>>
>> Sure, added three new tests to ensure correct percent-encoding
>> behaviour. The SpacesAreEverywhere(Signed) reproducer(s) also catch this
>> regression, which is how I noticed it to begin with. I also went ahead
>> and refactored the existing unit tests a bit.
>>
>> I had to add some simple input validation to the getVersionedUrl call. I
>> realized that the older version of this method was not actually dealing
>> with percent encoding on its own - if you passed it an invalid URL
>> containing a space, it would happily accept this, work with it, and
>> return to you another invalid URL containing the same spaces. This seems
>> incorrect at first but I didn't want to go about changing the behaviour
>> of this method. If the method started handling percent-encoding, then it
>> could be passing back up URLs that actually do point to valid resources,
>> when in fact that isn't *really* the same URL it was asked to work with.
>> Rather than causing an explicit failure on invalid input, I kept the
>> behaviour the same, so that the invalid URL is simply returned, and the
>> same later point of failure should still occur as it does now. Of the
>> two new test cases, two of them contain spaces. The return value of the
>> method on these inputs is simply the same URL that was given in the
>> first place, as this is what the method used to do. The other new test
>> case ensures that a properly-encoded URL is not mistakenly re-encoded.
> I think this is the better approach. Thanks for fixing this.
>
> Not related to this patch, but have you considered that this is the
> wrong place for handling this percent encoding problem? These URLs (with
> mistakes) only appear in JNLP files, right? Maybe the file parser should
> be deciding what to do with incorrectly encoded urls and the rest of
> icedtea-web (including ResourceUrlCreator) should assume the urls are
> not malformed.

Well, I ended up running into another corner case (as I mentioned on 
IRC) and decided it was easier to ditch the URI class and go back to 
manually constructing URLs :( . This is sort of ugly IMO but it 
completely sidesteps the percent-encoding issues involved with using 
URI, which was quickly making that solution uglier and uglier anyway. 
It's still enhanced over the pre-PR1204 fix version though, as that one 
was missing some of the optional parts in a URL. It had no user info and 
no query string, notably. Now it does, and tests for them too!

>
>> On another note, I have a question about this method now that I'm
>> looking back into it again. It's only called from two places in the
>> entire codebase - one place is these unit tests. The other place is
>> ResourceUrlCreator#getUrls(). getUrls() and getVersionedUrl() are both
>> instance methods of the class. When getUrls calls getVersionedUrl, it
>> passes it a Resource reference - but this Resource is also a field of
>> the instance. So, why does getVersionedUrl take a parameter at all if
>> the same reference is already available to it? Or, on the other hand,
>> getVersionedUrl doesn't care at all about its containing instance - the
>> only information it needs to perform its duty is a Resource reference.
>> So then getVersionedUrl can be made static and passed the Resource
>> reference. I don't understand that there's any reason for it to be in
>> this in-between state. Anyone seeing something I'm not?
> I don't recall the details in this particular case, but I do think all
> three options are generally valid. Depending on what the code is doing
> (especially if it a mutation), passing the instance variable as an
> argument may make sense.
>
> In this case, I am leaning towards keeping it an instance method and not
> passing the Resource as argument.

That's what I was leaning toward as well. I've also put that refactor 
into this patch since it really only affects about three lines of code.

>
>> +++ b/netx/net/sourceforge/jnlp/cache/ResourceUrlCreator.java
>> +        } catch (UnsupportedEncodingException e) { // "UTF-8" is hardcoded, this won't happen
>> +            return resourceUrl;
> If this case can only happen if a developer makes a mistake, perhaps
> throwing a 'new IllegalStateException("UTF-8 is a hardcoded and
> always-supported encoding")' may make more sense?
>
>> +    private static boolean checkEncoding(String path) {
> It's a good idea to name methods so the name describes the return value.
> In this case, it would be nice to describe what is it checking, and what
> the return value indicates.
>
> Maybe call it 'isPathMalformed'? (or maybe invert the conditional and
> call it 'isPathValid'?)
>
>> +        } catch (UnsupportedEncodingException e) {
> Please don't ignore exceptions. If it should never happen, throw an
> AssertionError or IllegalStateException.

Okay, this is sounds good to me. Thanks for the advice.

>
>> +        return !(path.equals(decodedUrl) && path.equals(encodedUrl));
> In general, it's better to comment the 'why' instead of the 'what'. I
> see some comments in this file which say that method foo may return null
> and (personally) I don't find those style of comments too useful (the
> method has javadocs already, no?). But I do think describing why this
> compare provides the right result is a good idea.
>
>> +++ b/tests/netx/unit/net/sourceforge/jnlp/cache/ResourceUrlCreatorTest.java
>> +    private static final Version version_11 = new Version("1.1");
>> +    private static final Version version_20 = new Version("2.0");
>> +    private static final Version version_two = new Version("version two");
>> +    private static final DownloadOptions dlopts_noPack_useVersion = new DownloadOptions(false, true);
>> +    private static final DownloadOptions dlopts_noPack_noVersion = new DownloadOptions(false, false);
> It's generally common to have constants (which is what these static,
> final and immutable objects are) declared as UPPER_CASE.

Yea, I had quickly extracted those values out into constants and didn't 
really think about what I named them. Should've followed convention the 
first time :)

>
>> +    @Test
>> +    public void testPartiallyEncodedUrl() throws MalformedURLException {
>> +        URL result = getResultUrl("http://test.com/a malformed%20url.jar", null, dlopts_noPack_useVersion);
>> +        assertEquals("http://test.com/a malformed%20url.jar", result.toString());
>> +    }
> Instead of "test.com", have you considered using "example.com"? It's a
> real URL meant for examples.
>
> Cheers,
> Omair

This is done in the new patch. Doesn't matter at this point because no 
network requests are ever made by the tests, but anyway no harm done 
changing it.

I also ended up adding several new tests. There's one extremely tiny new 
helper method in ResourceUrlCreator - all it does is take a string, 
return empty string if the argument is null and otherwise return the 
same string. Just because URL.get* tends to return null for optional 
non-specified parameters. I didn't write tests for that one.

New ChangeLog:
PR1204 patch regression fix
* netx/net/sourceforge/jnlp/cache/ResourceUrlCreator.java: 
(getVersionedUrl) fix regression in previous PR1204 patch. Refactor to 
not take Resource parameter, use instance's field instead. 
(uriPartToString) new method
* 
tests/netx/unit/net/sourceforge/jnlp/cache/ResourceUrlCreatorTest.java: 
new tests for ResourceUrlCreator.getVersionedUrl

Thanks,

-- 
Andrew A

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PR1204_regression_refactor.patch
Type: text/x-patch
Size: 14495 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/distro-pkg-dev/attachments/20131002/4dadbe04/PR1204_regression_refactor.patch 


More information about the distro-pkg-dev mailing list