RFR - 8132734: java.util.jar.* changes to support multi-release jar files
Hi, Let’s try again, this time there are tests. Please review the following webrev that adds support for multi-release jars as specified in JEP-238. Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ A multi-release jar file is a jar file that contains a manifest with a main attribute named "Multi-Release", and also contains a directory "META-INF/versions" with subdirectories that contain versioned entries segregated by the major version of Java platform releases. A versioned entry, with a version n, in the "META-INF/versions/{n}" directory overrides the unversioned root entry as well as any entry with a version number i where i < n. The changes in this webrev implement an aliasing mechanism in JarEntry so that when a JarFile client retrieves a JarEntry, the data from the entry pointed to by the alias is returned. There are methods to configure whether or not aliasing is enabled, and if it is, which version of an entry the alias points to. When a JarFile is used by a class loader to load class resources, the default version retrieved is the runtime version of the Java platform (i.e. a version 9 entry is returned when the platform is JDK 9). This mechanism can be configured by System properties. Thanks, Steve
On 14/10/2015 17:07, Steve Drach wrote:
Hi,
Let’s try again, this time there are tests. Please review the following webrev that adds support for multi-release jars as specified in JEP-238. Good to have tests this time :-)
I see several JAR files in the webrev, shouldn't the tests create these so that we don't have to check in binary files? I realize the signed case might be a bit of work but there may be some security tests that could provide examples. -Alan
Let’s try again, this time there are tests. Please review the following webrev that adds support for multi-release jars as specified in JEP-238. Good to have tests this time :-)
I see several JAR files in the webrev, shouldn't the tests create these so that we don't have to check in binary files? I realize the signed case might be a bit of work but there may be some security tests that could provide examples.
The current test directory contains binary jar files. In fact in all the test directories, there are 52 binary .jar files. I added three more. I thought about generating the jar files but the problem I run into is that there are two test classes that use the same files and creating the files twice seemed to be a bit wasteful. I couldn’t figure out a reliable way to have one test class create the files and the other one just use the files.
On 14/10/2015 17:28, Steve Drach wrote:
The current test directory contains binary jar files. In fact in all the test directories, there are 52 binary .jar files. I know but we need to work to remove those.
I added three more. I thought about generating the jar files but the problem I run into is that there are two test classes that use the same files and creating the files twice seemed to be a bit wasteful. I couldn’t figure out a reliable way to have one test class create the files and the other one just use the files. I'm not aware of anything in jtreg that would give you the @BeforeSuite-like support. One could use file locks to have it be setup in one place but it's probably more trouble that it is worth. How expensive is it to create them?
-Alan
The current test directory contains binary jar files. In fact in all the test directories, there are 52 binary .jar files. I know but we need to work to remove those.
I figured that might be the response, but thought it was worth the try ;-)
I added three more. I thought about generating the jar files but the problem I run into is that there are two test classes that use the same files and creating the files twice seemed to be a bit wasteful. I couldn’t figure out a reliable way to have one test class create the files and the other one just use the files. I'm not aware of anything in jtreg that would give you the @BeforeSuite-like support. One could use file locks to have it be setup in one place but it's probably more trouble that it is worth. How expensive is it to create them?
I don’t know, but probably not that expensive, basically for multi-release jar it’s 3 compiles and writing to a jarOutputStream(). I’ll do it.
On 14 Oct 2015, at 21:12, Steve Drach <steve.drach@oracle.com> wrote:
The current test directory contains binary jar files. In fact in all the test directories, there are 52 binary .jar files. I know but we need to work to remove those.
I figured that might be the response, but thought it was worth the try ;-)
A reasonable way forward is to go with pre-baked jar files (it’s not making the current situation particular any worse) then lets try and clear up this aspect in a later push, hopefully using a common mechanism that can be used by all such tests that need to construct jar files. We need a CCC, so i suggest if tests are not ready and approved by the time the CCC is approved then we push what we have. Paul.
I added three more. I thought about generating the jar files but the problem I run into is that there are two test classes that use the same files and creating the files twice seemed to be a bit wasteful. I couldn’t figure out a reliable way to have one test class create the files and the other one just use the files. I'm not aware of anything in jtreg that would give you the @BeforeSuite-like support. One could use file locks to have it be setup in one place but it's probably more trouble that it is worth. How expensive is it to create them?
I don’t know, but probably not that expensive, basically for multi-release jar it’s 3 compiles and writing to a jarOutputStream(). I’ll do it.
The current test directory contains binary jar files. In fact in all the test directories, there are 52 binary .jar files. I know but we need to work to remove those.
I figured that might be the response, but thought it was worth the try ;-)
A reasonable way forward is to go with pre-baked jar files (it’s not making the current situation particular any worse) then lets try and clear up this aspect in a later push, hopefully using a common mechanism that can be used by all such tests that need to construct jar files.
+1 on the common mechanism. For building jars and other common tasks. It would also be nice to have some form of test sequencing or other way to indicate that multiple tests depend on a common set of artifacts and that set only needs to be created once.
On 10/14/2015 1:56 PM, Steve Drach wrote:
The current test directory contains binary jar files. In fact in all the test directories, there are 52 binary .jar files. I know but we need to work to remove those. I figured that might be the response, but thought it was worth the try ;-)
A reasonable way forward is to go with pre-baked jar files (it’s not making the current situation particular any worse) then lets try and clear up this aspect in a later push, hopefully using a common mechanism that can be used by all such tests that need to construct jar files. +1 on the common mechanism. For building jars and other common tasks.
It would also be nice to have some form of test sequencing or other way to indicate that multiple tests depend on a common set of artifacts and that set only needs to be created once.
You can put multiple @build and @run directives in a test. In particular, you can build and run multiple Java sources files, such as sources for different logical tests, from a single jtreg @test file. I believe you can use this capability to implement sequencing. HTH, -Joe
Steve, Any reason the JarEntry.get/setSize() are the only ZipEntry methods get overridden? -Sherman On 10/14/2015 09:07 AM, Steve Drach wrote:
Hi,
Let’s try again, this time there are tests. Please review the following webrev that adds support for multi-release jars as specified in JEP-238.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
A multi-release jar file is a jar file that contains a manifest with a main attribute named "Multi-Release", and also contains a directory "META-INF/versions" with subdirectories that contain versioned entries segregated by the major version of Java platform releases. A versioned entry, with a version n, in the "META-INF/versions/{n}" directory overrides the unversioned root entry as well as any entry with a version number i where i< n.
The changes in this webrev implement an aliasing mechanism in JarEntry so that when a JarFile client retrieves a JarEntry, the data from the entry pointed to by the alias is returned. There are methods to configure whether or not aliasing is enabled, and if it is, which version of an entry the alias points to.
When a JarFile is used by a class loader to load class resources, the default version retrieved is the runtime version of the Java platform (i.e. a version 9 entry is returned when the platform is JDK 9). This mechanism can be configured by System properties.
Thanks, Steve
Any reason the JarEntry.get/setSize() are the only ZipEntry methods get overridden?
It didn’t seem necessary. The root entries are the “public interface”, we’re just providing aliased entry contents.
-Sherman
On 10/14/2015 09:07 AM, Steve Drach wrote:
Hi,
Let’s try again, this time there are tests. Please review the following webrev that adds support for multi-release jars as specified in JEP-238.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
A multi-release jar file is a jar file that contains a manifest with a main attribute named "Multi-Release", and also contains a directory "META-INF/versions" with subdirectories that contain versioned entries segregated by the major version of Java platform releases. A versioned entry, with a version n, in the "META-INF/versions/{n}" directory overrides the unversioned root entry as well as any entry with a version number i where i< n.
The changes in this webrev implement an aliasing mechanism in JarEntry so that when a JarFile client retrieves a JarEntry, the data from the entry pointed to by the alias is returned. There are methods to configure whether or not aliasing is enabled, and if it is, which version of an entry the alias points to.
When a JarFile is used by a class loader to load class resources, the default version retrieved is the runtime version of the Java platform (i.e. a version 9 entry is returned when the platform is JDK 9). This mechanism can be configured by System properties.
Thanks, Steve
On 10/14/15 4:04 PM, Steve Drach wrote:
Any reason the JarEntry.get/setSize() are the only ZipEntry methods get overridden? It didn’t seem necessary. The root entries are the “public interface”, we’re just providing aliased entry contents.
It does not sound right. The "exported public interface" of a jar file, or a multi-release-jar file is NOT those root entries, but the name of those entries. As the updated/newly added spec says, "The returned JarEntry is the versioned entry corresponding to the given root entry NAME prefixed with the string META-INF/versions/{n}...", So the returned entry should be the one that represents the "versioned entry", with the content of the entry and the meta data of the entry (the compressed size, the various timestamps, the comment...) form the versioned entry, not the root one, if there is a matched versioned entry. The implementation seems not follow this spec, it return the entry that actually is for the root entry, with a link to the versioned one, which serves the purpose of getting the corresponding input stream correctly, and make the verifier work (by simply passing the linked versioned to the verifier). However, all the meta data, accessible from the JarEntry APIs are all "broken", the attributes, certificates and the codeSigners are from the root entry. If my reading is correct, I'm not sure how it can work if someone wants to "verify" an individual signed entry by himself via security APIs, with all meta data from the root entry and the data itself from the versioned entry. I'm not sure if it is a good idea, from performance perspective, to add a "versionEntry" field into the JarEntry to support this feature, given most of the jar files might not be multi-release-jar aware, and the Jar input& output streams dont work with a multi-release jar directly. Why should they all pay a runtime price for it. If we really have to add an extra field, the JarFileEntry might be a better place, and it might be desired to define a new subclass JarFileEntryMR to use when the MR is enabled, instead of adding directly into the existing JarFileEntry. -Sherman On 10/14/2015 09:07 AM, Steve Drach wrote:
Hi,
Let’s try again, this time there are tests. Please review the following webrev that adds support for multi-release jars as specified in JEP-238.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
A multi-release jar file is a jar file that contains a manifest with a main attribute named "Multi-Release", and also contains a directory "META-INF/versions" with subdirectories that contain versioned entries segregated by the major version of Java platform releases. A versioned entry, with a version n, in the "META-INF/versions/{n}" directory overrides the unversioned root entry as well as any entry with a version number i where i< n.
The changes in this webrev implement an aliasing mechanism in JarEntry so that when a JarFile client retrieves a JarEntry, the data from the entry pointed to by the alias is returned. There are methods to configure whether or not aliasing is enabled, and if it is, which version of an entry the alias points to.
When a JarFile is used by a class loader to load class resources, the default version retrieved is the runtime version of the Java platform (i.e. a version 9 entry is returned when the platform is JDK 9). This mechanism can be configured by System properties.
Thanks, Steve
On 15 Oct 2015, at 05:00, Xueming Shen <xueming.shen@oracle.com> wrote:
I'm not sure if it is a good idea, from performance perspective, to add a "versionEntry" field into the JarEntry to support this feature, given most of the jar files might not be multi-release-jar aware, and the Jar input& output streams dont work with a multi-release jar directly. Why should they all pay a runtime price for it. If we really have to add an extra field, the JarFileEntry might be a better place, and it might be desired to define a new subclass JarFileEntryMR to use when the MR is enabled, instead of adding directly into the existing JarFileEntry.
According to jol there is currently space available due to alignment. If there was not it would add about 4% in direct instance size. But the actual footprint is likely to be chunkier because of the string character storage for the name so the % increase in size would be smaller e.g. perhaps on average < 2% which might be ok given that i presume such entries are unlikely to be cached. So i am not concerned about the size. If there was a way to design it to avoid modification of existing classes all the better, but i dunno if it is possible. Steve will surely know more. Paul.
On 10/15/15 1:53 AM, Paul Sandoz wrote:
On 15 Oct 2015, at 05:00, Xueming Shen <xueming.shen@oracle.com> wrote:
I'm not sure if it is a good idea, from performance perspective, to add a "versionEntry" field into the JarEntry to support this feature, given most of the jar files might not be multi-release-jar aware, and the Jar input& output streams dont work with a multi-release jar directly. Why should they all pay a runtime price for it. If we really have to add an extra field, the JarFileEntry might be a better place, and it might be desired to define a new subclass JarFileEntryMR to use when the MR is enabled, instead of adding directly into the existing JarFileEntry.
According to jol there is currently space available due to alignment. If there was not it would add about 4% in direct instance size. But the actual footprint is likely to be chunkier because of the string character storage for the name so the % increase in size would be smaller e.g. perhaps on average < 2% which might be ok given that i presume such entries are unlikely to be cached.
So i am not concerned about the size. If there was a way to design it to avoid modification of existing classes all the better, but i dunno if it is possible. Steve will surely know more.
Paul.
Let's try it from another angle:-) Based on the webrev, no one need to and actually does create a JarEntry with a versionedEntry, except JarFile, and JarFile only creates its own version of JarEntry, the JarFileEntry. The only non-JarFile consumer of "versioned" JarEntry is the package private JarVerifier.getCoderSigners, and obviously it takes a JarFile together with the source JarEntry (again, if this jarEntry is not from A JarFile, it should never have a "versionedEntry") So why do you want to put this field into the super class JarEntry, not the JarFileEntry, don't see any benefit of doing that. While I'm writing this email, it appears to me that we might have a more "severe" issue with the general/base JarEntry class to hold the link to the "versionedEntry". The "general" JarEntry object is not associated with a specific JarFile (the JarFileEntry does). So there is no way for JarFile.getInputStream(ZipFile ze) to verify that the JarEntry passed in and its "versionedEntry" is actually belong to "this" JarFile in the following implementation, if the "ze" is just a JarEntry but NOT instanceof of a JarFileEntry 759 public synchronized InputStream getInputStream(ZipEntry ze) 760 throws IOException 761 { 762 maybeInstantiateVerifier(); 763 764 if (ze instanceof JarEntry) { 765 ZipEntry vze = ((JarEntry)ze).versionedEntry; 766 if (vze != null) { 767 return getInputStream(vze); 768 } 769 } 770 I think it's a bug of the implementation if we don't check, as the "versioned entry" is really associated to the jar file that creates it. Sure, as I said above, there is actually no way you can create a general JarEntry or a JarFileEntry with a "versionedEntry" from "outside", but it appears to be possible (have not tried, just in theory) to mess up the current mechanism by passing a "jar entry" from one JarFile instance to another one, if two JarFile instances are open on the same multi-release-jar, but with different "version setting" policy... But again, I still believe it might be a wrong approach to add such a "versionedEntry" into any of the JarEntry, including the JarFileEntry. As specified by the specification, the returned entry should be the jar entry pointing to the versioned entity in the Jar File, not the root one. The question I would like to ask is why do you even need the "root entry" at all, if there is a matched versioned one. It might be desired to have a mechanism to return the "base/root name" for such an entry, but it probably does not deserve a "dedicate" entry object. -Sherman
So why do you want to put this field into the super class JarEntry, not the JarFileEntry, don't see any benefit of doing that.
I changed it. I put the versionedEntry in JarFileEntry instead of JarEntry. Not too difficult although I had to add a package private method to support JarVerifier. It seems to work as all 27 jar tests pass (some are not specific to multi-release jars).
Hi, Let’s try again. I considered the issues brought up in the last review and was able to not only remove the versionedEntry altogether, I was able to greatly simplify the entire changeset. I removed all changes to JarEntry and JarVerifier, and added a name field and some simple methods to JarFileEntry. This solved a whole bunch of potential issues. I’m also creating the jar files used in the tests so there are no binaries in the changeset. One test, MultiReleaseJarURLConnection, has an error on windows because it can’t delete one of the created jar files. I don’t think I can do anything about it — JDK-4823678 is a 12 year old bug that describes the problem. If anyone has an idea on what I can do to make this a clean test, please let me know. Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/> Thanks, Steve
On Oct 15, 2015, at 8:47 PM, Xueming Shen <xueming.shen@oracle.com> wrote:
On 10/15/15 1:53 AM, Paul Sandoz wrote:
On 15 Oct 2015, at 05:00, Xueming Shen <xueming.shen@oracle.com> wrote:
I'm not sure if it is a good idea, from performance perspective, to add a "versionEntry" field into the JarEntry to support this feature, given most of the jar files might not be multi-release-jar aware, and the Jar input& output streams dont work with a multi-release jar directly. Why should they all pay a runtime price for it. If we really have to add an extra field, the JarFileEntry might be a better place, and it might be desired to define a new subclass JarFileEntryMR to use when the MR is enabled, instead of adding directly into the existing JarFileEntry.
According to jol there is currently space available due to alignment. If there was not it would add about 4% in direct instance size. But the actual footprint is likely to be chunkier because of the string character storage for the name so the % increase in size would be smaller e.g. perhaps on average < 2% which might be ok given that i presume such entries are unlikely to be cached.
So i am not concerned about the size. If there was a way to design it to avoid modification of existing classes all the better, but i dunno if it is possible. Steve will surely know more.
Paul.
Let's try it from another angle:-) Based on the webrev, no one need to and actually does create a JarEntry with a versionedEntry, except JarFile, and JarFile only creates its own version of JarEntry, the JarFileEntry.
The only non-JarFile consumer of "versioned" JarEntry is the package private JarVerifier.getCoderSigners, and obviously it takes a JarFile together with the source JarEntry (again, if this jarEntry is not from A JarFile, it should never have a "versionedEntry")
So why do you want to put this field into the super class JarEntry, not the JarFileEntry, don't see any benefit of doing that.
While I'm writing this email, it appears to me that we might have a more "severe" issue with the general/base JarEntry class to hold the link to the "versionedEntry". The "general" JarEntry object is not associated with a specific JarFile (the JarFileEntry does). So there is no way for JarFile.getInputStream(ZipFile ze) to verify that the JarEntry passed in and its "versionedEntry" is actually belong to "this" JarFile in the following implementation, if the "ze" is just a JarEntry but NOT instanceof of a JarFileEntry
759 public synchronized InputStream getInputStream(ZipEntry ze) 760 throws IOException 761 { 762 maybeInstantiateVerifier(); 763 764 if (ze instanceof JarEntry) { 765 ZipEntry vze = ((JarEntry)ze).versionedEntry; 766 if (vze != null) { 767 return getInputStream(vze); 768 } 769 } 770
I think it's a bug of the implementation if we don't check, as the "versioned entry" is really associated to the jar file that creates it. Sure, as I said above, there is actually no way you can create a general JarEntry or a JarFileEntry with a "versionedEntry" from "outside", but it appears to be possible (have not tried, just in theory) to mess up the current mechanism by passing a "jar entry" from one JarFile instance to another one, if two JarFile instances are open on the same multi-release-jar, but with different "version setting" policy...
But again, I still believe it might be a wrong approach to add such a "versionedEntry" into any of the JarEntry, including the JarFileEntry. As specified by the specification, the returned entry should be the jar entry pointing to the versioned entity in the Jar File, not the root one. The question I would like to ask is why do you even need the "root entry" at all, if there is a matched versioned one. It might be desired to have a mechanism to return the "base/root name" for such an entry, but it probably does not deserve a "dedicate" entry object.
-Sherman
Hi Steve, The reifiedEntry() approach probably can help the default JarVerifier work as expected, but if I read the code correctly I doubt you can get the expected CodSigner[] and Certificatte[] result from a "versioned" JarFileEntry, after having read all bytes from the input stream (obtained via jzf.getInputStream(JarFileEntry)), as the JarEntry spec suggests,. As we are passing the "reified" entry into the VerifierStream alone, without any reference to the original jar file entry. It seems impossible for the original jar file entry can trace back to the corresponding certificates and code signers. This might be fixed by passing in the original entry together into the JarVerifier, but I doubt we might have a bigger issue here. I suspect with this approach an external verifier will have no easy way to verify the digit signature of the jar entry via java.security APIs. I would assume this is doable right now with current JarFile APIs, via a JarFile object, a Manifest and a target JarEntry. The external can get the signature via name -> manifest->attributes->signature (basically just consider to move the JarVerifier and couple sun.security.util classes out and use it as user code)... but with this implementation the name now is the root entry, but the bytes you can read from the stream is from the versioned one. We might want to bring in Max to take a look if what I said is really a supported use scenario. I might be wrong, not a security expert :-) Btw, for a "normal" JarEntry/ZipEntry (not a JarFileEntry), shouldn't the getInputStream(ze) simply return the stream for the root entry? The current implementation of getJarEntry(ze) does not seem right, as it returns a "versioned" JarFileEntry. I don't think you want to pass this one into VerifierStream directly, right? Again, I think it might be desired (at least the spec is not updated to say anything about "version") to simply return the input stream for the root entry. One more "nitpick". searchForVersionedEntry() now lookups the versioned candidate via super.getEntry() from version to BASE_VERSION, if the version is the latest version 9, the base is 0, we are basically doing this search for each non-versioned-entry inside this multi-release-jar file 9 times everytime when the entry is asked. In worse case scenario, a multi-release-jar, with huge number of entries with a small portion are versioned to 9, and you are iterating it via "entries". Each lookup might be cheap, but it might be worth considering adding some optimization. Best, Sherman On 10/20/2015 5:16 PM, Steve Drach wrote:
Hi,
Let’s try again. I considered the issues brought up in the last review and was able to not only remove the versionedEntry altogether, I was able to greatly simplify the entire changeset. I removed all changes to JarEntry and JarVerifier, and added a name field and some simple methods to JarFileEntry. This solved a whole bunch of potential issues. I’m also creating the jar files used in the tests so there are no binaries in the changeset. One test, MultiReleaseJarURLConnection, has an error on windows because it can’t delete one of the created jar files. I don’t think I can do anything about it — JDK-4823678 is a 12 year old bug that describes the problem. If anyone has an idea on what I can do to make this a clean test, please let me know.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/%7Epsandoz/multiversion-jar/jar-webrev/>
Thanks, Steve
On Oct 15, 2015, at 8:47 PM, Xueming Shen <xueming.shen@oracle.com <mailto:xueming.shen@oracle.com>> wrote:
On 10/15/15 1:53 AM, Paul Sandoz wrote:
On 15 Oct 2015, at 05:00, Xueming Shen <xueming.shen@oracle.com <mailto:xueming.shen@oracle.com>> wrote:
I'm not sure if it is a good idea, from performance perspective, to add a "versionEntry" field into the JarEntry to support this feature, given most of the jar files might not be multi-release-jar aware, and the Jar input& output streams dont work with a multi-release jar directly. Why should they all pay a runtime price for it. If we really have to add an extra field, the JarFileEntry might be a better place, and it might be desired to define a new subclass JarFileEntryMR to use when the MR is enabled, instead of adding directly into the existing JarFileEntry.
According to jol there is currently space available due to alignment. If there was not it would add about 4% in direct instance size. But the actual footprint is likely to be chunkier because of the string character storage for the name so the % increase in size would be smaller e.g. perhaps on average < 2% which might be ok given that i presume such entries are unlikely to be cached.
So i am not concerned about the size. If there was a way to design it to avoid modification of existing classes all the better, but i dunno if it is possible. Steve will surely know more.
Paul.
Let's try it from another angle:-) Based on the webrev, no one need to and actually does create a JarEntry with a versionedEntry, except JarFile, and JarFile only creates its own version of JarEntry, the JarFileEntry.
The only non-JarFile consumer of "versioned" JarEntry is the package private JarVerifier.getCoderSigners, and obviously it takes a JarFile together with the source JarEntry (again, if this jarEntry is not from A JarFile, it should never have a "versionedEntry")
So why do you want to put this field into the super class JarEntry, not the JarFileEntry, don't see any benefit of doing that.
While I'm writing this email, it appears to me that we might have a more "severe" issue with the general/base JarEntry class to hold the link to the "versionedEntry". The "general" JarEntry object is not associated with a specific JarFile (the JarFileEntry does). So there is no way for JarFile.getInputStream(ZipFile ze) to verify that the JarEntry passed in and its "versionedEntry" is actually belong to "this" JarFile in the following implementation, if the "ze" is just a JarEntry but NOT instanceof of a JarFileEntry
759 public synchronized InputStream getInputStream(ZipEntry ze) 760 throws IOException 761 { 762 maybeInstantiateVerifier(); 763 764 if (ze instanceof JarEntry) { 765 ZipEntry vze = ((JarEntry)ze).versionedEntry; 766 if (vze != null) { 767 return getInputStream(vze); 768 } 769 } 770
I think it's a bug of the implementation if we don't check, as the "versioned entry" is really associated to the jar file that creates it. Sure, as I said above, there is actually no way you can create a general JarEntry or a JarFileEntry with a "versionedEntry" from "outside", but it appears to be possible (have not tried, just in theory) to mess up the current mechanism by passing a "jar entry" from one JarFile instance to another one, if two JarFile instances are open on the same multi-release-jar, but with different "version setting" policy...
But again, I still believe it might be a wrong approach to add such a "versionedEntry" into any of the JarEntry, including the JarFileEntry. As specified by the specification, the returned entry should be the jar entry pointing to the versioned entity in the Jar File, not the root one. The question I would like to ask is why do you even need the "root entry" at all, if there is a matched versioned one. It might be desired to have a mechanism to return the "base/root name" for such an entry, but it probably does not deserve a "dedicate" entry object.
-Sherman
On Oct 21, 2015, at 3:17 PM, Xueming Shen <xueming.shen@oracle.com> wrote:
We might want to bring in Max to take a look if what I said is really a supported use scenario.
I haven't read Steve's latest code change. I will read if you think it's necessary. First, I think we agree that the multi-release jar file feature is only making use of the existing jar file format and does not intend to introduce any change to its specification. This means a JarFile signed by one JDK release should be verified by another JDK release. Ok, the next question is, should it modify the JarFile API? I hope not, because the JarFile API is the single entry we access a JarFile when we want to sign or verify it. I hope there is a brand new API for a multi-versioned jar file, probably a child class of JarFile, so that no matter you call getJarEntry() or entries() on it, you always get the versioned one and the unrelated ones are completely invisible. If this is not OK, maybe we can rename the current JarFile to RawJarFile and name the new API JarFile. Signing and verification will work on RawJarFile. Not sure if it's easy. --Max
Hi, We’ve published another webrev for review. Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ This one addresses the issues regarding CodeSigners, Certificates, verification, and other security issues raised in the last round, including whether third party verification is a supported use case. I also partially fixed a nitpick involving performance while searching for versioned entries, by putting in a cache for previously searched entries. And I found a way around the issue with windows being unable to delete jar files accessed through URL’s in one test. Steve
On Oct 21, 2015, at 12:54 AM, Wang Weijun <weijun.wang@oracle.com> wrote:
On Oct 21, 2015, at 3:17 PM, Xueming Shen <xueming.shen@oracle.com> wrote:
We might want to bring in Max to take a look if what I said is really a supported use scenario.
I haven't read Steve's latest code change. I will read if you think it's necessary.
First, I think we agree that the multi-release jar file feature is only making use of the existing jar file format and does not intend to introduce any change to its specification. This means a JarFile signed by one JDK release should be verified by another JDK release.
Ok, the next question is, should it modify the JarFile API? I hope not, because the JarFile API is the single entry we access a JarFile when we want to sign or verify it. I hope there is a brand new API for a multi-versioned jar file, probably a child class of JarFile, so that no matter you call getJarEntry() or entries() on it, you always get the versioned one and the unrelated ones are completely invisible.
If this is not OK, maybe we can rename the current JarFile to RawJarFile and name the new API JarFile. Signing and verification will work on RawJarFile.
Not sure if it's easy.
--Max
On Oct 21, 2015, at 12:17 AM, Xueming Shen <xueming.shen@oracle.com> wrote:
Hi Steve,
The reifiedEntry() approach probably can help the default JarVerifier work as expected, but if I read the code correctly I doubt you can get the expected CodSigner[] and Certificatte[] result from a "versioned" JarFileEntry, after having read all bytes from the input stream (obtained via jzf.getInputStream(JarFileEntry)), as the JarEntry spec suggests,. As we are passing the "reified" entry into the VerifierStream alone, without any reference to the original jar file entry. It seems impossible for the original jar file entry can trace back to the corresponding certificates and code signers. This might be fixed by passing in the original entry together into the JarVerifier, but I doubt we might have a bigger issue here. I suspect with this approach an external verifier will have no easy way to verify the digit signature of the jar entry via java.security APIs. I would assume this is doable right now with current JarFile APIs, via a JarFile object, a Manifest and a target JarEntry. The external can get the signature via name -> manifest->attributes->signature (basically just consider to move the JarVerifier and couple sun.security.util classes out and use it as user code)... but with this implementation the name now is the root entry, but the bytes you can read from the stream is from the versioned one. We might want to bring in Max to take a look if what I said is really a supported use scenario. I might be wrong, not a security expert :-)
Btw, for a "normal" JarEntry/ZipEntry (not a JarFileEntry), shouldn't the getInputStream(ze) simply return the stream for the root entry? The current implementation of getJarEntry(ze) does not seem right, as it returns a "versioned" JarFileEntry. I don't think you want to pass this one into VerifierStream directly, right? Again, I think it might be desired (at least the spec is not updated to say anything about "version") to simply return the input stream for the root entry.
One more "nitpick". searchForVersionedEntry() now lookups the versioned candidate via super.getEntry() from version to BASE_VERSION, if the version is the latest version 9, the base is 0, we are basically doing this search for each non-versioned-entry inside this multi-release-jar file 9 times everytime when the entry is asked. In worse case scenario, a multi-release-jar, with huge number of entries with a small portion are versioned to 9, and you are iterating it via "entries". Each lookup might be cheap, but it might be worth considering adding some optimization.
Best, Sherman
Hi Steve, I know I stared to sound like a broken record :-) But I would like to suggest the team one more time to reconsider the current decision of using the "set" methods to change the configuration setting/status of an existing JarFile to enable the multi-version support. public JarFile setVersioned(int version); public JarFile setRuntimeVersioned(); The main concern here is the current approach basically transfers the JarFile from a read-only/ immutable object with consistent behavior (to entry inquiry) to a mutable container of entries with possibility of inconsistent behavior. The newly introduced setVersioned/setRuntimeVersioned really have no way to guarantee A expected result from the updated version-enabled getEntry() method, as someone else might set an unexpected different "version" between your setting and getting, or even worse, in the middle of your entries() invocation, for example, in which you get part of your entries to version N and the rest to version M. So It might be desired to have the "versioned support" enabled in the constructor, so once you get that version enabled JarFile, it stays that way for its lifetime with consistent result for the entry inquiry, as the current API does. I do realize that there might be use case that the getEntry invoker might not have the access to the creation of the corresponding jar file (such as the use scenario in that JarURLConnection?), so you can't create a version-enabled JarFile at the very beginning via the constructor. But doesn't this also make my concern more real. If you don't have the control of the lifetime of that JarFile, you don't really have the control of who is setting or going to set the version of that mutable JarFile, right? An alternative might be to have change the setVersioned/setRuntimeVersioned() to public jarFile getVersioned(int version); public jarFile getRuntimeVersioned(); to return a new copy of the existing JarFile with the desired verisoning support. Yes, it might be too heavy from performance perspective :-) and we might have to do some tricky stuff (it would be easier if ZipJarFile is interface ...) to have a light wrapper class to delegate everything to the real one. That said, I'm fine to be told "the pros and cons were considered, and this is the best for the supported use scenario":-) In that case, it might deserve some wording in the spec notes to prepare the developer the possible unexpected. Thanks, Sherman On 10/26/15 10:26 AM, Steve Drach wrote:
Hi,
We’ve published another webrev for review.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
This one addresses the issues regarding CodeSigners, Certificates, verification, and other security issues raised in the last round, including whether third party verification is a supported use case. I also partially fixed a nitpick involving performance while searching for versioned entries, by putting in a cache for previously searched entries. And I found a way around the issue with windows being unable to delete jar files accessed through URL’s in one test.
Steve
On Oct 21, 2015, at 12:54 AM, Wang Weijun <weijun.wang@oracle.com> wrote:
On Oct 21, 2015, at 3:17 PM, Xueming Shen <xueming.shen@oracle.com> wrote:
We might want to bring in Max to take a look if what I said is really a supported use scenario. I haven't read Steve's latest code change. I will read if you think it's necessary.
First, I think we agree that the multi-release jar file feature is only making use of the existing jar file format and does not intend to introduce any change to its specification. This means a JarFile signed by one JDK release should be verified by another JDK release.
Ok, the next question is, should it modify the JarFile API? I hope not, because the JarFile API is the single entry we access a JarFile when we want to sign or verify it. I hope there is a brand new API for a multi-versioned jar file, probably a child class of JarFile, so that no matter you call getJarEntry() or entries() on it, you always get the versioned one and the unrelated ones are completely invisible.
If this is not OK, maybe we can rename the current JarFile to RawJarFile and name the new API JarFile. Signing and verification will work on RawJarFile.
Not sure if it's easy.
--Max On Oct 21, 2015, at 12:17 AM, Xueming Shen <xueming.shen@oracle.com> wrote:
Hi Steve,
The reifiedEntry() approach probably can help the default JarVerifier work as expected, but if I read the code correctly I doubt you can get the expected CodSigner[] and Certificatte[] result from a "versioned" JarFileEntry, after having read all bytes from the input stream (obtained via jzf.getInputStream(JarFileEntry)), as the JarEntry spec suggests,. As we are passing the "reified" entry into the VerifierStream alone, without any reference to the original jar file entry. It seems impossible for the original jar file entry can trace back to the corresponding certificates and code signers. This might be fixed by passing in the original entry together into the JarVerifier, but I doubt we might have a bigger issue here. I suspect with this approach an external verifier will have no easy way to verify the digit signature of the jar entry via java.security APIs. I would assume this is doable right now with current JarFile APIs, via a JarFile object, a Manifest and a target JarEntry. The external can get the signature via name -> manifest->attributes->signature (basically just consider to move the JarVerifier and couple sun.security.util classes out and use it as user code)... but with this implementation the name now is the root entry, but the bytes you can read from the stream is from the versioned one. We might want to bring in Max to take a look if what I said is really a supported use scenario. I might be wrong, not a security expert :-)
Btw, for a "normal" JarEntry/ZipEntry (not a JarFileEntry), shouldn't the getInputStream(ze) simply return the stream for the root entry? The current implementation of getJarEntry(ze) does not seem right, as it returns a "versioned" JarFileEntry. I don't think you want to pass this one into VerifierStream directly, right? Again, I think it might be desired (at least the spec is not updated to say anything about "version") to simply return the input stream for the root entry.
One more "nitpick". searchForVersionedEntry() now lookups the versioned candidate via super.getEntry() from version to BASE_VERSION, if the version is the latest version 9, the base is 0, we are basically doing this search for each non-versioned-entry inside this multi-release-jar file 9 times everytime when the entry is asked. In worse case scenario, a multi-release-jar, with huge number of entries with a small portion are versioned to 9, and you are iterating it via "entries". Each lookup might be cheap, but it might be worth considering adding some optimization.
Best, Sherman
FWIW I agree - it's just weird to have the behavior change after the fact like that. On 10/26/2015 11:37 PM, Xueming Shen wrote:
Hi Steve,
I know I stared to sound like a broken record :-) But I would like to suggest the team one more time to reconsider the current decision of using the "set" methods to change the configuration setting/status of an existing JarFile to enable the multi-version support.
public JarFile setVersioned(int version); public JarFile setRuntimeVersioned();
The main concern here is the current approach basically transfers the JarFile from a read-only/ immutable object with consistent behavior (to entry inquiry) to a mutable container of entries with possibility of inconsistent behavior. The newly introduced setVersioned/setRuntimeVersioned really have no way to guarantee A expected result from the updated version-enabled getEntry() method, as someone else might set an unexpected different "version" between your setting and getting, or even worse, in the middle of your entries() invocation, for example, in which you get part of your entries to version N and the rest to version M.
So It might be desired to have the "versioned support" enabled in the constructor, so once you get that version enabled JarFile, it stays that way for its lifetime with consistent result for the entry inquiry, as the current API does.
I do realize that there might be use case that the getEntry invoker might not have the access to the creation of the corresponding jar file (such as the use scenario in that JarURLConnection?), so you can't create a version-enabled JarFile at the very beginning via the constructor. But doesn't this also make my concern more real. If you don't have the control of the lifetime of that JarFile, you don't really have the control of who is setting or going to set the version of that mutable JarFile, right?
An alternative might be to have change the setVersioned/setRuntimeVersioned() to
public jarFile getVersioned(int version); public jarFile getRuntimeVersioned();
to return a new copy of the existing JarFile with the desired verisoning support. Yes, it might be too heavy from performance perspective :-) and we might have to do some tricky stuff (it would be easier if ZipJarFile is interface ...) to have a light wrapper class to delegate everything to the real one.
That said, I'm fine to be told "the pros and cons were considered, and this is the best for the supported use scenario":-) In that case, it might deserve some wording in the spec notes to prepare the developer the possible unexpected.
Thanks, Sherman
On 10/26/15 10:26 AM, Steve Drach wrote:
Hi,
We’ve published another webrev for review.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
This one addresses the issues regarding CodeSigners, Certificates, verification, and other security issues raised in the last round, including whether third party verification is a supported use case. I also partially fixed a nitpick involving performance while searching for versioned entries, by putting in a cache for previously searched entries. And I found a way around the issue with windows being unable to delete jar files accessed through URL’s in one test.
Steve
On Oct 21, 2015, at 12:54 AM, Wang Weijun <weijun.wang@oracle.com> wrote:
On Oct 21, 2015, at 3:17 PM, Xueming Shen <xueming.shen@oracle.com> wrote:
We might want to bring in Max to take a look if what I said is really a supported use scenario. I haven't read Steve's latest code change. I will read if you think it's necessary.
First, I think we agree that the multi-release jar file feature is only making use of the existing jar file format and does not intend to introduce any change to its specification. This means a JarFile signed by one JDK release should be verified by another JDK release.
Ok, the next question is, should it modify the JarFile API? I hope not, because the JarFile API is the single entry we access a JarFile when we want to sign or verify it. I hope there is a brand new API for a multi-versioned jar file, probably a child class of JarFile, so that no matter you call getJarEntry() or entries() on it, you always get the versioned one and the unrelated ones are completely invisible.
If this is not OK, maybe we can rename the current JarFile to RawJarFile and name the new API JarFile. Signing and verification will work on RawJarFile.
Not sure if it's easy.
--Max On Oct 21, 2015, at 12:17 AM, Xueming Shen <xueming.shen@oracle.com> wrote:
Hi Steve,
The reifiedEntry() approach probably can help the default JarVerifier work as expected, but if I read the code correctly I doubt you can get the expected CodSigner[] and Certificatte[] result from a "versioned" JarFileEntry, after having read all bytes from the input stream (obtained via jzf.getInputStream(JarFileEntry)), as the JarEntry spec suggests,. As we are passing the "reified" entry into the VerifierStream alone, without any reference to the original jar file entry. It seems impossible for the original jar file entry can trace back to the corresponding certificates and code signers. This might be fixed by passing in the original entry together into the JarVerifier, but I doubt we might have a bigger issue here. I suspect with this approach an external verifier will have no easy way to verify the digit signature of the jar entry via java.security APIs. I would assume this is doable right now with current JarFile APIs, via a JarFile object, a Manifest and a target JarEntry. The external can get the signature via name -> manifest->attributes->signature (basically just consider to move the JarVerifier and couple sun.security.util classes out and use it as user code)... but with this implementation the name now is the root entry, but the bytes you can read from the stream is from the versioned one. We might want to bring in Max to take a look if what I said is really a supported use scenario. I might be wrong, not a security expert :-)
Btw, for a "normal" JarEntry/ZipEntry (not a JarFileEntry), shouldn't the getInputStream(ze) simply return the stream for the root entry? The current implementation of getJarEntry(ze) does not seem right, as it returns a "versioned" JarFileEntry. I don't think you want to pass this one into VerifierStream directly, right? Again, I think it might be desired (at least the spec is not updated to say anything about "version") to simply return the input stream for the root entry.
One more "nitpick". searchForVersionedEntry() now lookups the versioned candidate via super.getEntry() from version to BASE_VERSION, if the version is the latest version 9, the base is 0, we are basically doing this search for each non-versioned-entry inside this multi-release-jar file 9 times everytime when the entry is asked. In worse case scenario, a multi-release-jar, with huge number of entries with a small portion are versioned to 9, and you are iterating it via "entries". Each lookup might be cheap, but it might be worth considering adding some optimization.
Best, Sherman
-- - DML
Please review the latest webrev that addresses the issue raised in Sherman’s comments below, with my comments interspersed in-line. The changes between this webrev and the last one are in the definition and use of the ismultiRelease() method and the introduction of a configuration lock, the boolean configured, that prevents setting the version after an entry has been read from the jar file. As a consequence of the configuration lock, I had to modify a couple tests and add a new one. Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
On Oct 26, 2015, at 9:37 PM, Xueming Shen <xueming.shen@oracle.com> wrote:
Hi Steve,
I know I stared to sound like a broken record :-) But I would like to suggest the team one more time to reconsider the current decision of using the "set" methods to change the configuration setting/status of an existing JarFile to enable the multi-version support.
public JarFile setVersioned(int version); public JarFile setRuntimeVersioned();
We did reconsider it, more later.
The main concern here is the current approach basically transfers the JarFile from a read-only/ immutable object with consistent behavior (to entry inquiry) to a mutable container of entries with possibility of inconsistent behavior. The newly introduced setVersioned/setRuntimeVersioned really have no way to guarantee A expected result from the updated version-enabled getEntry() method, as someone else might set an unexpected different "version" between your setting and getting, or even worse, in the middle of your entries() invocation, for example, in which you get part of your entries to version N and the rest to version M.
We’ve introduced a configuration lock. One can change the versioning strategy as often as they wish before reading an entry. However if setVersioned() is called after an entry has been read, an IllegalStateException is thrown.
So It might be desired to have the "versioned support" enabled in the constructor, so once you get that version enabled JarFile, it stays that way for its lifetime with consistent result for the entry inquiry, as the current API does.
We can’t do it in the constructor because we need to read the manifest, and doing that in the constructor leads to a stack overflow for some instances, such as deploy, where JarFile is subclassed and the manifest is lazily read. Basically, the manifest is not available during construction.
I do realize that there might be use case that the getEntry invoker might not have the access to the creation of the corresponding jar file (such as the use scenario in that JarURLConnection?), so you can't create a version-enabled JarFile at the very beginning via the constructor. But doesn't this also make my concern more real. If you don't have the control of the lifetime of that JarFile, you don't really have the control of who is setting or going to set the version of that mutable JarFile, right?
An alternative might be to have change the setVersioned/setRuntimeVersioned() to
public jarFile getVersioned(int version); public jarFile getRuntimeVersioned();
to return a new copy of the existing JarFile with the desired verisoning support.
That’s a problem because we would then have two instances of JarFile, and we’d need to remember to close the underlying instance. An alternative is to use factory methods to create the versioned JarFile. That works, but we have to add about six of them to correspond to the current set of constructors. But the big problem here is, as you mention, the JarFile returned from JarURLConnection. From experiments it looks to me that the underlying cached JarFile is unlinked from the Unix filesystem and thus unavailable for replication. This code seems very OS dependent as well as potentially fragile since subclasses can implement the abstract getJarFile() method any way they wish. A configuration lock seems to be a good compromise. We can mutate the JarFile versioning strategy as much as we want until we read an entry, and then no more mutations are allowed. For what it is worth, I don’t see a real use case to set the versioning strategy more than once, even before an entry is read. I do that in the tests, but it’s difficult to imagine it anywhere else. The intention is to construct a JarFile and then immediately set the versioning strategy.
Yes, it might be too heavy from performance perspective :-) and we might have to do some tricky stuff (it would be easier if ZipJarFile is interface ...) to have a light wrapper class to delegate everything to the real one.
That said, I'm fine to be told "the pros and cons were considered, and this is the best for the supported use scenario":-)
Yes, that is the case.
In that case, it might deserve some wording in the spec notes to prepare the developer the possible unexpected.
Thanks, Sherman
On 10/26/15 10:26 AM, Steve Drach wrote:
Hi,
We’ve published another webrev for review.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
This one addresses the issues regarding CodeSigners, Certificates, verification, and other security issues raised in the last round, including whether third party verification is a supported use case. I also partially fixed a nitpick involving performance while searching for versioned entries, by putting in a cache for previously searched entries. And I found a way around the issue with windows being unable to delete jar files accessed through URL’s in one test.
Steve
On Oct 21, 2015, at 12:54 AM, Wang Weijun <weijun.wang@oracle.com> wrote:
On Oct 21, 2015, at 3:17 PM, Xueming Shen <xueming.shen@oracle.com> wrote:
We might want to bring in Max to take a look if what I said is really a supported use scenario. I haven't read Steve's latest code change. I will read if you think it's necessary.
First, I think we agree that the multi-release jar file feature is only making use of the existing jar file format and does not intend to introduce any change to its specification. This means a JarFile signed by one JDK release should be verified by another JDK release.
Ok, the next question is, should it modify the JarFile API? I hope not, because the JarFile API is the single entry we access a JarFile when we want to sign or verify it. I hope there is a brand new API for a multi-versioned jar file, probably a child class of JarFile, so that no matter you call getJarEntry() or entries() on it, you always get the versioned one and the unrelated ones are completely invisible.
If this is not OK, maybe we can rename the current JarFile to RawJarFile and name the new API JarFile. Signing and verification will work on RawJarFile.
Not sure if it's easy.
--Max On Oct 21, 2015, at 12:17 AM, Xueming Shen <xueming.shen@oracle.com> wrote:
Hi Steve,
The reifiedEntry() approach probably can help the default JarVerifier work as expected, but if I read the code correctly I doubt you can get the expected CodSigner[] and Certificatte[] result from a "versioned" JarFileEntry, after having read all bytes from the input stream (obtained via jzf.getInputStream(JarFileEntry)), as the JarEntry spec suggests,. As we are passing the "reified" entry into the VerifierStream alone, without any reference to the original jar file entry. It seems impossible for the original jar file entry can trace back to the corresponding certificates and code signers. This might be fixed by passing in the original entry together into the JarVerifier, but I doubt we might have a bigger issue here. I suspect with this approach an external verifier will have no easy way to verify the digit signature of the jar entry via java.security APIs. I would assume this is doable right now with current JarFile APIs, via a JarFile object, a Manifest and a target JarEntry. The external can get the signature via name -> manifest->attributes->signature (basically just consider to move the JarVerifier and couple sun.security.util classes out and use it as user code)... but with this implementation the name now is the root entry, but the bytes you can read from the stream is from the versioned one. We might want to bring in Max to take a look if what I said is really a supported use scenario. I might be wrong, not a security expert :-)
Btw, for a "normal" JarEntry/ZipEntry (not a JarFileEntry), shouldn't the getInputStream(ze) simply return the stream for the root entry? The current implementation of getJarEntry(ze) does not seem right, as it returns a "versioned" JarFileEntry. I don't think you want to pass this one into VerifierStream directly, right? Again, I think it might be desired (at least the spec is not updated to say anything about "version") to simply return the input stream for the root entry.
One more "nitpick". searchForVersionedEntry() now lookups the versioned candidate via super.getEntry() from version to BASE_VERSION, if the version is the latest version 9, the base is 0, we are basically doing this search for each non-versioned-entry inside this multi-release-jar file 9 times everytime when the entry is asked. In worse case scenario, a multi-release-jar, with huge number of entries with a small portion are versioned to 9, and you are iterating it via "entries". Each lookup might be cheap, but it might be worth considering adding some optimization.
Best, Sherman
Curious that you added a new method called jarFile.getRuntimeVersionedEntry(entryName). Is this the *only* method you would call for a multi-release jar? If so, is it still necessary to modify the old getEntry() method? Thanks Max
On Nov 4, 2015, at 1:11 AM, Steve Drach <steve.drach@oracle.com> wrote:
Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
On Nov 3, 2015, at 5:24 PM, Wang Weijun <weijun.wang@oracle.com> wrote:
Curious that you added a new method called jarFile.getRuntimeVersionedEntry(entryName).
It’s new to JarFile, but not new to the changeset, it’s been in there since the reviews started.
Is this the *only* method you would call for a multi-release jar?
No, it’s a method we had to add to support runtime versioning for the Class.getResource() method. We either had to set runtime versioning for all JarFiles obtained from a JarURLConnection or set it specifically for a resource entry retrieved by the Class.getResource() method, that uses a JarURLConnection. It’s a rather convoluted path through the system, but it you follow it, you’ll see what we did and how — look at the changes for JarURLConnection. It’s a rather specialized method that I wish could be private, but it can’t be. Having said that, someone might find a use for it. It’s equivalent to JarFile.setRuntimeVersioned().getJarEntry(), but it doesn’t set a permanent versioning strategy for the JarFile. See MultiReleaseJarProperties testRuntimeVersioning test.
If so, is it still necessary to modify the old getEntry() method?
Yes, for general entry retrieval as well as for loading classes.
Thanks Max
On Nov 4, 2015, at 1:11 AM, Steve Drach <steve.drach@oracle.com> wrote:
Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
Hi Steve, Hi Steve, I don’t think we need to cache versioned entries (as we discussed a while back). For class loading it’s likely to increase memory costs without any performance benefit (if anything a performance decrease). It’s easy to add back later on if we have data that suggests otherwise. We can reduce the search space for searching for versioned entries by setting the lower bound of the base version to one minus the first Java major release that has runtime support for multi-release jar files i.e. 8 as it currently stands. 312 * @throws IllegalStateException if called after an entry has been read … after an entry has been obtained (see ….) You might need some further clarification in setVersioned/setRuntimeVersioned itself describing the lifecycle e.g. this method may be called one or more times after construction and before the first operation that obtains an entry, after which the jar file configuration’s is considered immutable, and subsequent calls to this method will result in an ISE. Paul.
On 3 Nov 2015, at 18:11, Steve Drach <steve.drach@oracle.com> wrote:
Please review the latest webrev that addresses the issue raised in Sherman’s comments below, with my comments interspersed in-line. The changes between this webrev and the last one are in the definition and use of the ismultiRelease() method and the introduction of a configuration lock, the boolean configured, that prevents setting the version after an entry has been read from the jar file. As a consequence of the configuration lock, I had to modify a couple tests and add a new one.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
On Nov 4, 2015, at 1:01 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
Hi Steve,
Hi Steve,
I don’t think we need to cache versioned entries (as we discussed a while back). For class loading it’s likely to increase memory costs without any performance benefit (if anything a performance decrease). It’s easy to add back later on if we have data that suggests otherwise.
Okay, I’ll take that code out. As you said, it’s easy to add back if needed.
We can reduce the search space for searching for versioned entries by setting the lower bound of the base version to one minus the first Java major release that has runtime support for multi-release jar files i.e. 8 as it currently stands.
Yes, I can do that. I’ll need to change the test jar file from versions 8 and 9 to versions 9 and 10 because we won’t see the version 8 classes in the current test jar and it’s nice to have a test jar with more than one version.
312 * @throws IllegalStateException if called after an entry has been read
… after an entry has been obtained (see ….)
You might need some further clarification in setVersioned/setRuntimeVersioned itself describing the lifecycle e.g.
this method may be called one or more times after construction and before the first operation that obtains an entry, after which the jar file configuration’s is considered immutable, and subsequent calls to this method will result in an ISE.
Okay.
Paul.
On 3 Nov 2015, at 18:11, Steve Drach <steve.drach@oracle.com> wrote:
Please review the latest webrev that addresses the issue raised in Sherman’s comments below, with my comments interspersed in-line. The changes between this webrev and the last one are in the definition and use of the ismultiRelease() method and the introduction of a configuration lock, the boolean configured, that prevents setting the version after an entry has been read from the jar file. As a consequence of the configuration lock, I had to modify a couple tests and add a new one.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
Hi, Here’s a new webrev that addresses the issues Paul brought up. The versioned entry cache has been removed, the search space has been reduced, the documentation for setVersioned(int) and setRuntimeVersioned() has been updated to clarify when IllegalStateException is thrown, and the tests have been changed to accommodate a jar file with versions 9 and 10, rather than 8 and 9. Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/> Steve
On Nov 4, 2015, at 1:01 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
Hi Steve,
Hi Steve,
I don’t think we need to cache versioned entries (as we discussed a while back). For class loading it’s likely to increase memory costs without any performance benefit (if anything a performance decrease). It’s easy to add back later on if we have data that suggests otherwise.
We can reduce the search space for searching for versioned entries by setting the lower bound of the base version to one minus the first Java major release that has runtime support for multi-release jar files i.e. 8 as it currently stands.
312 * @throws IllegalStateException if called after an entry has been read
… after an entry has been obtained (see ….)
You might need some further clarification in setVersioned/setRuntimeVersioned itself describing the lifecycle e.g.
this method may be called one or more times after construction and before the first operation that obtains an entry, after which the jar file configuration’s is considered immutable, and subsequent calls to this method will result in an ISE.
Paul.
On 3 Nov 2015, at 18:11, Steve Drach <steve.drach@oracle.com> wrote:
Please review the latest webrev that addresses the issue raised in Sherman’s comments below, with my comments interspersed in-line. The changes between this webrev and the last one are in the definition and use of the ismultiRelease() method and the introduction of a configuration lock, the boolean configured, that prevents setting the version after an entry has been read from the jar file. As a consequence of the configuration lock, I had to modify a couple tests and add a new one.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
Hi Steve, This looks good to me (i only browsed the test code). It’s been around the block a few times :-) IMO, baring any major issues, it’s time to push and we can clean up any ancillary issues with later pushes. Paul.
On 5 Nov 2015, at 18:10, Steve Drach <steve.drach@oracle.com> wrote:
Hi,
Here’s a new webrev that addresses the issues Paul brought up. The versioned entry cache has been removed, the search space has been reduced, the documentation for setVersioned(int) and setRuntimeVersioned() has been updated to clarify when IllegalStateException is thrown, and the tests have been changed to accommodate a jar file with versions 9 and 10, rather than 8 and 9.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/>
Steve
Hi Sherman. Would you please give this another pass? I want make sure all your concerns are met. Thanks Steve
On Nov 6, 2015, at 12:44 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
Hi Steve,
This looks good to me (i only browsed the test code). It’s been around the block a few times :-) IMO, baring any major issues, it’s time to push and we can clean up any ancillary issues with later pushes.
Paul.
On 5 Nov 2015, at 18:10, Steve Drach <steve.drach@oracle.com> wrote:
Hi,
Here’s a new webrev that addresses the issues Paul brought up. The versioned entry cache has been removed, the search space has been reduced, the documentation for setVersioned(int) and setRuntimeVersioned() has been updated to clarify when IllegalStateException is thrown, and the tests have been changed to accommodate a jar file with versions 9 and 10, rather than 8 and 9.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/>
Steve
Hi Steve, My apology for slow response. I was sick in bed since last Sat... I've scanned the update, here are my comments (1) The newly added "configured" obviously will help in most use scenario, but (yes, there is always a but :-)), given the related code is not synchronized, I'm pretty sure there is race condition somewhere, for example betwen Ln#388 -- Ln#396, if there is a "faster" second getEntry catches up there, a base/root entry will return, but a versioned will return next time if the isMultiRelease is set to true and there is a matched versioned entry for it. (2) Now the BASE_VERSION has been raised to 8, the logic in setVersioned(n) will actually block any version num < 8, as now it is forced to be the BASE_VERSION, any lookup for versioned will stop at BASE_VERSION. Is it a specified behavior? adjust the BASE_VERSION to the version being set 'n' might be a more reasonable approach? (3) Again one more security concern you might want to confirm with vuln team (probably during their review). With current implementation, in use scenario a) it's multi-release jar b) is multi-release jar enabled c) all base/root entries are signed d) the set of versioned entries are not signed. the implementation returns versioned entries successfully, they are unsigned. Is this a concern, as arguably, the user is asking entry with the root name, and the corresponding entry for that root name is signed. But we return a linked versioned-entry, and it's not signed. This might not be an issue at all. Please confirm this when doing security review. -Sherman On 11/6/15 9:23 AM, Steve Drach wrote:
Hi Sherman.
Would you please give this another pass? I want make sure all your concerns are met.
Thanks Steve
On Nov 6, 2015, at 12:44 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
Hi Steve,
This looks good to me (i only browsed the test code). It’s been around the block a few times :-) IMO, baring any major issues, it’s time to push and we can clean up any ancillary issues with later pushes.
Paul.
On 5 Nov 2015, at 18:10, Steve Drach <steve.drach@oracle.com> wrote:
Hi,
Here’s a new webrev that addresses the issues Paul brought up. The versioned entry cache has been removed, the search space has been reduced, the documentation for setVersioned(int) and setRuntimeVersioned() has been updated to clarify when IllegalStateException is thrown, and the tests have been changed to accommodate a jar file with versions 9 and 10, rather than 8 and 9.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/>
Steve
On 6 Nov 2015, at 22:50, Xueming Shen <xueming.shen@oracle.com> wrote:
Hi Steve,
My apology for slow response. I was sick in bed since last Sat…
Oh no, i hope you are getting better.
I've scanned the update, here are my comments
(1) The newly added "configured" obviously will help in most use scenario, but (yes, there is always a but :-)), given the related code is not synchronized, I'm pretty sure there is race condition somewhere, for example betwen Ln#388 -- Ln#396, if there is a "faster" second getEntry catches up there, a base/root entry will return, but a versioned will return next time if the isMultiRelease is set to true and there is a matched versioned entry for it.
Steve and I were having a hard time determining the exact nature of what is actually thread safe in JarFile, it appears to be rather delicately balanced. We need to carefully rethink this aspect. It would be so much easier if we could sort this out in the constructor, alas some cases seem to make it very tricky to do that.
(2) Now the BASE_VERSION has been raised to 8, the logic in setVersioned(n) will actually block any version num < 8, as now it is forced to be the BASE_VERSION, any lookup for versioned will stop at BASE_VERSION. Is it a specified behavior?
It should be.
adjust the BASE_VERSION to the version being set 'n' might be a more reasonable approach?
It does not make much sense to support versioned entries below the Java version that supports multi-release JARs, since such versioned entries will be ignored when the JarFile is processed on a corresponding JDK of that version, and when processed on a Java version that supports multi-release they effectively shadow to earlier or unversioned entries, thus it might induce unintended differences in behaviour if such entries differ in content. Paul.
(3) Again one more security concern you might want to confirm with vuln team (probably during their review). With current implementation, in use scenario a) it's multi-release jar b) is multi-release jar enabled c) all base/root entries are signed d) the set of versioned entries are not signed.
the implementation returns versioned entries successfully, they are unsigned. Is this a concern, as arguably, the user is asking entry with the root name, and the corresponding entry for that root name is signed. But we return a linked versioned-entry, and it's not signed. This might not be an issue at all. Please confirm this when doing security review.
-Sherman
On 11/6/15 9:23 AM, Steve Drach wrote:
Hi Sherman.
Would you please give this another pass? I want make sure all your concerns are met.
Thanks Steve
On Nov 6, 2015, at 12:44 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
Hi Steve,
This looks good to me (i only browsed the test code). It’s been around the block a few times :-) IMO, baring any major issues, it’s time to push and we can clean up any ancillary issues with later pushes.
Paul.
On 5 Nov 2015, at 18:10, Steve Drach <steve.drach@oracle.com> wrote:
Hi,
Here’s a new webrev that addresses the issues Paul brought up. The versioned entry cache has been removed, the search space has been reduced, the documentation for setVersioned(int) and setRuntimeVersioned() has been updated to clarify when IllegalStateException is thrown, and the tests have been changed to accommodate a jar file with versions 9 and 10, rather than 8 and 9.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/>
Steve
Hi Sherman, Thanks for looking at this.
(1) The newly added "configured" obviously will help in most use scenario, but (yes, there is always a but :-)), given the related code is not synchronized, I'm pretty sure there is race condition somewhere, for example betwen Ln#388 -- Ln#396, if there is a "faster" second getEntry catches up there, a base/root entry will return, but a versioned will return next time if the isMultiRelease is set to true and there is a matched versioned entry for it.
It seems to me we could synchronize/lock the code after line 392 with little loss in performance, although the recursive nature of getManifest might cause a problem. I need to think about it a bit. I guess JarFile was thread safe since getEntry was idempotent
(2) Now the BASE_VERSION has been raised to 8, the logic in setVersioned(n) will actually bl any version num < 8, as now it is forced to be the BASE_VERSION, any lookup for versioned will stop at BASE_VERSION. Is it a specified behavior? adjust the BASE_VERSION to the version being set 'n' might be a more reasonable approach?
(3) Again one more security concern you might want to confirm with vuln team (probably during their review). With current implementation, in use scenario a) it's multi-release jar b) is multi-release jar enabled c) all base/root entries are signed d) the set of versioned entries are not signed.
the implementation returns versioned entries successfully, they are unsigned. Is this a concern, as arguably, the user is asking entry with the root name, and the corresponding entry for that root name is signed. But we return a linked versioned-entry, and it's not signed. This might not be an issue at all. Please confirm this when doing security review.
-Sherman
On 11/6/15 9:23 AM, Steve Drach wrote:
Hi Sherman.
Would you please give this another pass? I want make sure all your concerns are met.
Thanks Steve
On Nov 6, 2015, at 12:44 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
Hi Steve,
This looks good to me (i only browsed the test code). It’s been around the block a few times :-) IMO, baring any major issues, it’s time to push and we can clean up any ancillary issues with later pushes.
Paul.
On 5 Nov 2015, at 18:10, Steve Drach <steve.drach@oracle.com> wrote:
Hi,
Here’s a new webrev that addresses the issues Paul brought up. The versioned entry cache has been removed, the search space has been reduced, the documentation for setVersioned(int) and setRuntimeVersioned() has been updated to clarify when IllegalStateException is thrown, and the tests have been changed to accommodate a jar file with versions 9 and 10, rather than 8 and 9.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/>
Steve
Note to readers, I sent an earlier incomplete version of this email due to fat finger syndrome. Hopefully this response will be complete. Hi Sherman, Thanks for looking at this.
(1) The newly added "configured" obviously will help in most use scenario, but (yes, there is always a but :-)), given the related code is not synchronized, I'm pretty sure there is race condition somewhere, for example betwen Ln#388 -- Ln#396, if there is a "faster" second getEntry catches up there, a base/root entry will return, but a versioned will return next time if the isMultiRelease is set to true and there is a matched versioned entry for it.
I guess JarFile was thread safe since getEntry was idempotent before I changed it. It seems to me we could synchronize/lock the code after line 392 with little loss in performance, although the recursive nature of getManifest might cause a problem. As Paul said, we need to rethink this.
(2) Now the BASE_VERSION has been raised to 8, the logic in setVersioned(n) will actually bl any version num < 8, as now it is forced to be the BASE_VERSION, any lookup for versioned will stop at BASE_VERSION. Is it a specified behavior? adjust the BASE_VERSION to the version being set 'n' might be a more reasonable approach?
Paul’s explanation is correct. The ‘set’ methods establish an upper bound whereas the BASE_VERSION is a lower bound.
(3) Again one more security concern you might want to confirm with vuln team (probably during their review). With current implementation, in use scenario a) it's multi-release jar b) is multi-release jar enabled c) all base/root entries are signed d) the set of versioned entries are not signed.
the implementation returns versioned entries successfully, they are unsigned. Is this a concern, as arguably, the user is asking entry with the root name, and the corresponding entry for that root name is signed. But we return a linked versioned-entry, and it's not signed. This might not be an issue at all. Please confirm this when doing security review.
I know I’ve discussed this with Max several times. Whenever the contents of a signed jar file as entries added to it, it should be resigned. So, if folks do the right thing, your scenario shouldn’t occur, although I don’t think there is a way to guarantee that. I will point this out to the vulnerability team. Again, thanks for your comments. Steve
-Sherman
On 11/6/15 9:23 AM, Steve Drach wrote:
Hi Sherman.
Would you please give this another pass? I want make sure all your concerns are met.
Thanks Steve
On Nov 6, 2015, at 12:44 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
Hi Steve,
This looks good to me (i only browsed the test code). It’s been around the block a few times :-) IMO, baring any major issues, it’s time to push and we can clean up any ancillary issues with later pushes.
Paul.
On 5 Nov 2015, at 18:10, Steve Drach <steve.drach@oracle.com> wrote:
Hi,
Here’s a new webrev that addresses the issues Paul brought up. The versioned entry cache has been removed, the search space has been reduced, the documentation for setVersioned(int) and setRuntimeVersioned() has been updated to clarify when IllegalStateException is thrown, and the tests have been changed to accommodate a jar file with versions 9 and 10, rather than 8 and 9.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/>
Steve
On 05/11/2015 17:10, Steve Drach wrote:
Hi,
Here’s a new webrev that addresses the issues Paul brought up. The versioned entry cache has been removed, the search space has been reduced, the documentation for setVersioned(int) and setRuntimeVersioned() has been updated to clarify when IllegalStateException is thrown, and the tests have been changed to accommodate a jar file with versions 9 and 10, rather than 8 and 9.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/>
I assume adding #rtversioned to JAR URLs will need discussion. I don't know if you are planning to include this in the first push or not but I assume it will require changes to the JAR URL scheme defined in java.net.JarURLConnection. I'm sure there is lots of code in the wild that parses JAR URLs. I looked through JarFile and don't seem anything obviously wrong. The property jdk.util.jar.multirelease is a bit unusual, an alternative name for the property might be jdk.util.jar.enableMultiRelease with "true", "false" and "force" as possible values. I skimmed over the tests and good to see that you pulled back from checking in binary JAR files. As JarFile is Closeable then you could use try-with-resources to ensure the JAR files are closed, otherwise test failures will leave files open. -Alan.
On 8 Nov 2015, at 16:51, Alan Bateman <Alan.Bateman@oracle.com> wrote:
On 05/11/2015 17:10, Steve Drach wrote:
Hi,
Here’s a new webrev that addresses the issues Paul brought up. The versioned entry cache has been removed, the search space has been reduced, the documentation for setVersioned(int) and setRuntimeVersioned() has been updated to clarify when IllegalStateException is thrown, and the tests have been changed to accommodate a jar file with versions 9 and 10, rather than 8 and 9.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/>
I assume adding #rtversioned to JAR URLs will need discussion. I don't know if you are planning to include this in the first push or not but I assume it will require changes to the JAR URL scheme defined in java.net.JarURLConnection. I'm sure there is lots of code in the wild that parses JAR URLs.
I was wondering if it might be possible to consider this a mostly internal contract since the URL class loading functionality (URLClassPath.java) creates such URLs for processing by JarURLConnection in some code paths (getResourceAsStream IIRC). Since JarURLConnection caches JarFiles we need to distinguish between runtime class loading behaviour and independently created URLs. A fragment is the most unobtrusive way to do this, and i think a reasonably accurate use of the URL syntax. I expect we will have to say something since these URLs will pop out from ClassLoader.getResource. Paul.
I looked through JarFile and don't seem anything obviously wrong. The property jdk.util.jar.multirelease is a bit unusual, an alternative name for the property might be jdk.util.jar.enableMultiRelease with "true", "false" and "force" as possible values.
I skimmed over the tests and good to see that you pulled back from checking in binary JAR files. As JarFile is Closeable then you could use try-with-resources to ensure the JAR files are closed, otherwise test failures will leave files open.
-Alan.
On 08/11/2015 19:36, Paul Sandoz wrote:
:
I was wondering if it might be possible to consider this a mostly internal contract since the URL class loading functionality (URLClassPath.java) creates such URLs for processing by JarURLConnection in some code paths (getResourceAsStream IIRC). Since JarURLConnection caches JarFiles we need to distinguish between runtime class loading behaviour and independently created URLs.
A fragment is the most unobtrusive way to do this, and i think a reasonably accurate use of the URL syntax.
I can go along with using a URL fragment although the proposed value (rtversionsed) is a little bit strange. As to whether this is implementation vs. JAR URL spec then I assume it needs to be spec so that libraries can create URLs that will use runtime versioning when access the JAR. I just read the mails about the configured setting, I just wonder if there could be any interaction with JAR cache as it's not keyed on whether it is runtime versioned. -Alan
On 8 Nov 2015, at 21:21, Alan Bateman <Alan.Bateman@oracle.com> wrote:
On 08/11/2015 19:36, Paul Sandoz wrote:
:
I was wondering if it might be possible to consider this a mostly internal contract since the URL class loading functionality (URLClassPath.java) creates such URLs for processing by JarURLConnection in some code paths (getResourceAsStream IIRC). Since JarURLConnection caches JarFiles we need to distinguish between runtime class loading behaviour and independently created URLs.
A fragment is the most unobtrusive way to do this, and i think a reasonably accurate use of the URL syntax.
I can go along with using a URL fragment although the proposed value (rtversionsed) is a little bit strange.
The name is derived from the requirement that if the fragment is present a call to getRuntimeVersionedEntry is required (or the equivalent of), thus the URL is a reference to a runtime versioned entry.
As to whether this is implementation vs. JAR URL spec then I assume it needs to be spec so that libraries can create URLs that will use runtime versioning when access the JAR.
Yeah, i don’t think we can avoid it.
I just read the mails about the configured setting, I just wonder if there could be any interaction with JAR cache as it's not keyed on whether it is runtime versioned.
Do you mean caching by JarFileFactory? AFAICT it’s only used by JarURLConnection and that never configures JarFiles, which is why we need the fragment. Paul.
On 08/11/2015 20:38, Paul Sandoz wrote:
: The name is derived from the requirement that if the fragment is present a call to getRuntimeVersionedEntry is required (or the equivalent of), thus the URL is a reference to a runtime versioned entry. I was just wondering about something shorter, like #runtime.
: Do you mean caching by JarFileFactory? AFAICT it’s only used by JarURLConnection and that never configures JarFiles, which is why we need the fragment.
Okay, I think ignore my comment, it can use getRuntimeVersionedEntry (as it does) and so doesn't need to configure the JarFile. If it did need to configure the JarFile then it couldn't work reliably of course. -Alan
On 9 Nov 2015, at 09:19, Alan Bateman <Alan.Bateman@oracle.com> wrote:
On 08/11/2015 20:38, Paul Sandoz wrote:
: The name is derived from the requirement that if the fragment is present a call to getRuntimeVersionedEntry is required (or the equivalent of), thus the URL is a reference to a runtime versioned entry. I was just wondering about something shorter, like #runtime.
No objections to that. Paul.
It’s nice to see these issues get resolved while I was sleeping ;-)
On 08/11/2015 20:38, Paul Sandoz wrote:
: The name is derived from the requirement that if the fragment is present a call to getRuntimeVersionedEntry is required (or the equivalent of), thus the URL is a reference to a runtime versioned entry. I was just wondering about something shorter, like #runtime.
No objections to that.
I’ll change the fragment to #runtime
As to whether this is implementation vs. JAR URL spec then I assume it needs to be spec so that libraries can create URLs that will use runtime versioning when access the JAR.
Yeah, i don’t think we can avoid it.
Ok. I guess I should put the information in JarURLConnection.
As to whether this is implementation vs. JAR URL spec then I assume it needs to be spec so that libraries can create URLs that will use runtime versioning when access the JAR.
Yeah, i don’t think we can avoid it.
Ok. I guess I should put the information in JarURLConnection.
Actually it’s a bit strange documenting the #runtime fragment in JarURLConnection, because none of the changes are in that class. The changes are in the sun.net.www.protocol.jar.JarURLConnection class that is not publicly documented.
On 9 Nov 2015, at 21:53, Steve Drach <steve.drach@oracle.com> wrote:
As to whether this is implementation vs. JAR URL spec then I assume it needs to be spec so that libraries can create URLs that will use runtime versioning when access the JAR.
Yeah, i don’t think we can avoid it.
Ok. I guess I should put the information in JarURLConnection.
Actually it’s a bit strange documenting the #runtime fragment in JarURLConnection, because none of the changes are in that class. The changes are in the sun.net.www.protocol.jar.JarURLConnection class that is not publicly documented.
You can document it as a “should” requirement for subclasses of j.n.JarURLConnection. For example: Subclasses of JarURLConnection that support multi-release JAR files should support URLs that refer to JAR runtime versioned entries. Such a URL is a URL referring to an entry modified to include the fragment “runtime”. Subclasses that do not support multi-release JAR files should ignore the fragment and process such URLs as if the fragment were not present. Then you can refer to an example. There is a chance that the fragment might perturb existing subclasses, likely only if the URL parsing is buggy. Grepcode does not report many external subclasses: http://grepcode.com/search/usages?id=repository.grepcode.com$java$root@jdk$o... It’s more common to register a URLStreamHandlerFactory via URL.setURLStreamHandlerFactory. I suspect that is used more for supporting different URL schemes than for overriding the “jar” scheme. This is a very delicate area, override the “jar" scheme might be risky given the interaction with class loaders. IMO the URLs returned from JDK ClassLoaders should be using URL schemes that are not overridable, but it’s probably too late to change that. Paul.
On 10/11/2015 09:06, Paul Sandoz wrote:
:
It’s more common to register a URLStreamHandlerFactory via URL.setURLStreamHandlerFactory. I suspect that is used more for supporting different URL schemes than for overriding the “jar” scheme. This is a very delicate area, override the “jar" scheme might be risky given the interaction with class loaders. IMO the URLs returned from JDK ClassLoaders should be using URL schemes that are not overridable, but it’s probably too late to change that.
We had to re-examine this area in preparation for modules and java.net.URL has the following in its javadoc: * Some protocol handlers, for example those used for loading platform * classes or classes on the class path, may not be overridden. The details * of such restrictions, and when those restrictions apply (during * initialization of the runtime for example), are implementation specific * and therefore not specified. At things stand then the file and jrt protocol handlers are not overridable. The jar protocol handler can be overridden but not via the ServiceLoader mechanism (for obvious reasons) and only after the VM/runtime has been initialized. The Java Plugin and Java Web Start in Oracle's builds is the only case that I'm aware of where the jar protocol handler is overridden. -Alan
Hi, Please review the new webrev that addresses the issues raised by Sherman and Alan in the last iteration. In particular: - fixed the race condition in isMultiRelease() and another one with the variables ‘version’ and ‘configured’ - changed the fragment for JarURLConnection runtime versioning from ‘rtversioned’ to ‘runtime’, and created documentation for it - used try with resources to open JarFile in all the tests Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/> Thanks, Steve
On 11/11/2015 16:44, Steve Drach wrote:
Hi,
Please review the new webrev that addresses the issues raised by Sherman and Alan in the last iteration. In particular: - fixed the race condition in isMultiRelease() and another one with the variables ‘version’ and ‘configured’ - changed the fragment for JarURLConnection runtime versioning from ‘rtversioned’ to ‘runtime’, and created documentation for it - used try with resources to open JarFile in all the tests
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/%7Epsandoz/multiversion-jar/jar-webrev/>
The updated webrev looks must better. In JarURLConnection then it would be good if the reference to multi-release JARs should link to the description in the JarFile spec. In the previous round then we were discussing renaming the jdk.util.jar.multirelease property. Has there been any more on that? The test MultiReleaseJarURLConnection uses @library /lib/testlibrary/java/util/jar so it's reaching across the file system to use the infrastructure of the JarFile tests. It might be clearer to move the test to the JarFile directory. It would be nice if we could reduce some of the really long lines if possible, just to make future side-by-side a bit easier (avoid horizontal scrolling). -Alan.
Hi Alan, Thanks for looking at this.
Please review the new webrev that addresses the issues raised by Sherman and Alan in the last iteration. In particular: - fixed the race condition in isMultiRelease() and another one with the variables ‘version’ and ‘configured’ - changed the fragment for JarURLConnection runtime versioning from ‘rtversioned’ to ‘runtime’, and created documentation for it - used try with resources to open JarFile in all the tests
Issue: <https://bugs.openjdk.java.net/browse/JDK-8132734>https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/%7Epsandoz/multiversion-jar/jar-webrev/>
The updated webrev looks must better. In JarURLConnection then it would be good if the reference to multi-release JARs should link to the description in the JarFile spec.
Sure, just put a “see JarFile” comment in?
In the previous round then we were discussing renaming the jdk.util.jar.multirelease property. Has there been any more on that?
Less of a discussion than a suggestion ;-) I didn’t change it because I wanted to see how important it was — I figured if you didn’t comment it was okay. But you did comment, so I guess it’s important. I’ll change it.
The test MultiReleaseJarURLConnection uses @library /lib/testlibrary/java/util/jar so it's reaching across the file system to use the infrastructure of the JarFile tests. It might be clearer to move the test to the JarFile directory.
I can do that. I didn’t think that was the right directory to put it in. Seems like it should be with the other JarURLConnection tests, that way I could just run jtreg on the directory and get all the JarURLConnection tests run, including the multi-release jar test.
It would be nice if we could reduce some of the really long lines if possible, just to make future side-by-side a bit easier (avoid horizontal scrolling).
I’ll try.
-Alan.
On 11/11/15 8:44 AM, Steve Drach wrote:
Hi,
Please review the new webrev that addresses the issues raised by Sherman and Alan in the last iteration. In particular: - fixed the race condition in isMultiRelease() and another one with the variables ‘version’ and ‘configured’ - changed the fragment for JarURLConnection runtime versioning from ‘rtversioned’ to ‘runtime’, and created documentation for it - used try with resources to open JarFile in all the tests
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/%7Epsandoz/multiversion-jar/jar-webrev/>
Hi Steve, Seems like the "version" is now a little confused. (1) getVersioned() says "or 0 if this jar file is processed as if it is an unversioned or is not a multi-release jar file". The implementation now returns 8 (BASE_VERSION is 8 now) if not a multi-release jar (?). (2) setVersioned() says "If the specified version is 0 then this is configured to be an unversioned jar file". The implementation seems treat anything < BASE_VERSION as such? Sure, technically it's still correct. Given the BASE_VERSION is a private field, any thing between 0 and BASE_VERSION becomes unspecified gray area. (3) getRuntimeVersionedEntry() spec says "If the JarFile is a multi-release jar file and is configured to be processed as such, then ...". However the implementation if (RUNTIME_VERSION > BASE_VERSION && !name.startsWith(META_INF) && isMultiRelease()) { ... } does not appears to check the logic "is configured to be ..."? And 391 if (manifestRead) { 392 return isMultiRelease; 393 } 394 synchronized (this) { 395 // this needs to be set prior to getManifest() call to prevent recursive loop 396 manifestRead = true; 397 try { 398 Manifest man = getManifest(); 399 isMultiRelease = (man != null) && man.getMainAttributes().containsKey(MULTI_RELEASE); 400 } catch (IOException x) { 401 isMultiRelease = false; 402 } 403 } 404 return isMultiRelease; don't we have a race condition if #391/2 is outside the synchronized block? for sample, one thread is inside sync block and set the manifestRead = true, but before isMultiRelease = true/false, and the second thread gets to #391. thanks, -Sherman
Hi Sherman, Thanks for looking at this. Comments in-line below.
On Nov 17, 2015, at 9:49 AM, Xueming Shen <xueming.shen@oracle.com> wrote:
On 11/11/15 8:44 AM, Steve Drach wrote:
Hi,
Please review the new webrev that addresses the issues raised by Sherman and Alan in the last iteration. In particular: - fixed the race condition in isMultiRelease() and another one with the variables ‘version’ and ‘configured’ - changed the fragment for JarURLConnection runtime versioning from ‘rtversioned’ to ‘runtime’, and created documentation for it - used try with resources to open JarFile in all the tests
Issue: <https://bugs.openjdk.java.net/browse/JDK-8132734>https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ <http://cr.openjdk.java.net/%7Epsandoz/multiversion-jar/jar-webrev/>
Hi Steve,
Seems like the "version" is now a little confused.
More likely it’s my confusion.
(1) getVersioned() says "or 0 if this jar file is processed as if it is an unversioned or is not a multi-release jar file". The implementation now returns 8 (BASE_VERSION is 8 now) if not a multi-release jar (?).
Yep. It should return 0 if version <= BASE_VERSION
(2) setVersioned() says "If the specified version is 0 then this is configured to be an unversioned jar file". The implementation seems treat anything < BASE_VERSION as such? Sure, technically it's still correct. Given the BASE_VERSION is a private field, any thing between 0 and BASE_VERSION becomes unspecified gray area.
As you say, it’s technically correct. We expect people will set version to 0 if they want unversioned processing. BASE_VERSION is not something clients should be concerned with, or even know about. So, yes, they can set versions to something less than BASE_VERSION but it doesn’t really matter, we know what they mean even if they don’t ;-) It’s a bit of a loose interpretation of the spec that doesn’t cause any harm. It doesn’t seem as useful if we strictly enforce this by throw an IllegalArgumentException for example.
(3) getRuntimeVersionedEntry() spec says "If the JarFile is a multi-release jar file and is configured to be processed as such, then ...". However the implementation
if (RUNTIME_VERSION > BASE_VERSION && !name.startsWith(META_INF) && isMultiRelease()) { ... }
does not appears to check the logic "is configured to be …"?
Yes, clearly an error in the spec.
And
391 if (manifestRead) { 392 return isMultiRelease; 393 } 394 synchronized (this) { 395 // this needs to be set prior to getManifest() call to prevent recursive loop 396 manifestRead = true; 397 try { 398 Manifest man = getManifest(); 399 isMultiRelease = (man != null) && man.getMainAttributes().containsKey(MULTI_RELEASE); 400 } catch (IOException x) { 401 isMultiRelease = false; 402 } 403 } 404 return isMultiRelease; don't we have a race condition if #391/2 is outside the synchronized block? for sample, one thread is inside sync block and set the manifestRead = true, but before isMultiRelease = true/false, and the second thread gets to #391.
You’re right. once upon a time, there was a third boolean to deal with the potential loop, but I took it out and tried to overload manifestRead with dual meaning. I need to put the third variable back in. This is a surprisingly complex method, hard to get right.
thanks, -Sherman
Hi, Please review the latest iteration of the webrev for runtime support of multi-release jar files. Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/ I believe I addressed all issues brought up in the last review as follows.
In JarURLConnection then it would be good if the reference to multi-release JARs should link to the description in the JarFile spec.
I included a comment in JarURLConnection referencing JarFile for more information regarding multi-release jar files.
In the previous round then we were discussing renaming the jdk.util.jar.multirelease property. Has there been any more on that?
I changed the property to jdk.util.jar.enableMultiRelease={true | false | force} as suggested. I assume camel case is okay here.
The test MultiReleaseJarURLConnection uses @library /lib/testlibrary/java/util/jar so it's reaching across the file system to use the infrastructure of the JarFile tests. It might be clearer to move the test to the JarFile directory.
I’m not sure what it means to reach across the file system. The bulk of the multi-release tests are in jdk/test/java/util/jar/JarFile with the MultiReleaseJarURLConnection test in jdk/test/sun/net/www/protocol/jar. Both test directories refer to the third directory, jdk/test/lib/testlibrary/java/util/jar, for common resources. All three directories are in the jdk repo. I don’t see anything obviously incorrect about where things are, the tests are in the same hierarchy as the components they test. So pending further information, I left it as it was.
It would be nice if we could reduce some of the really long lines if possible, just to make future side-by-side a bit easier (avoid horizontal scrolling).
I put the right margin for code at 100 and wrapped lines exceeding that.
(1) getVersioned() says "or 0 if this jar file is processed as if it is an unversioned or is not a multi-release jar file". The implementation now returns 8 (BASE_VERSION is 8 now) if not a multi-release jar (?).
I fixed it to return 0.
(2) setVersioned() says "If the specified version is 0 then this is configured to be an unversioned jar file". The implementation seems treat anything < BASE_VERSION as such? Sure, technically it's still correct. Given the BASE_VERSION is a private field, any thing between 0 and BASE_VERSION becomes unspecified gray area.
I clarified it in the documentation. It now says "If the specified version is less than 9, then this JarFile is configured to be processed as if it is an unversioned jar file.” I also made some additional clarifying documentation changes in overall documentation for JarFile.
(3) getRuntimeVersionedEntry() spec says "If the JarFile is a multi-release jar file and is configured to be processed as such, then ...". However the implementation
if (RUNTIME_VERSION > BASE_VERSION && !name.startsWith(META_INF) && isMultiRelease()) { ... }
does not appears to check the logic "is configured to be …”?
I took the “is configured to be…” phrase out of the documentation.
And
391 if (manifestRead) { 392 return isMultiRelease; 393 } 394 synchronized (this) { 395 // this needs to be set prior to getManifest() call to prevent recursive loop 396 manifestRead = true; 397 try { 398 Manifest man = getManifest(); 399 isMultiRelease = (man != null) && man.getMainAttributes().containsKey(MULTI_RELEASE); 400 } catch (IOException x) { 401 isMultiRelease = false; 402 } 403 } 404 return isMultiRelease; don't we have a race condition if #391/2 is outside the synchronized block? for sample, one thread is inside sync block and set the manifestRead = true, but before isMultiRelease = true/false, and the second thread gets to #391.
We rewrote the isMultirelease() method to eliminate the race. We also simplified it, removing the recursive invocation, with a small change to the getManEntry() method. Thanks, Steve
Hi Steve, The change looks fine. thanks, Sherman On 11/18/2015 01:27 PM, Steve Drach wrote:
Hi,
Please review the latest iteration of the webrev for runtime support of multi-release jar files.
Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 Webrev: http://cr.openjdk.java.net/~psandoz/multiversion-jar/jar-webrev/
I believe I addressed all issues brought up in the last review as follows.
In JarURLConnection then it would be good if the reference to multi-release JARs should link to the description in the JarFile spec. I included a comment in JarURLConnection referencing JarFile for more information regarding multi-release jar files.
In the previous round then we were discussing renaming the jdk.util.jar.multirelease property. Has there been any more on that? I changed the property to jdk.util.jar.enableMultiRelease={true | false | force} as suggested. I assume camel case is okay here.
The test MultiReleaseJarURLConnection uses @library /lib/testlibrary/java/util/jar so it's reaching across the file system to use the infrastructure of the JarFile tests. It might be clearer to move the test to the JarFile directory. I’m not sure what it means to reach across the file system. The bulk of the multi-release tests are in jdk/test/java/util/jar/JarFile with the MultiReleaseJarURLConnection test in jdk/test/sun/net/www/protocol/jar. Both test directories refer to the third directory, jdk/test/lib/testlibrary/java/util/jar, for common resources. All three directories are in the jdk repo. I don’t see anything obviously incorrect about where things are, the tests are in the same hierarchy as the components they test. So pending further information, I left it as it was.
It would be nice if we could reduce some of the really long lines if possible, just to make future side-by-side a bit easier (avoid horizontal scrolling). I put the right margin for code at 100 and wrapped lines exceeding that.
(1) getVersioned() says "or 0 if this jar file is processed as if it is an unversioned or is not a multi-release jar file". The implementation now returns 8 (BASE_VERSION is 8 now) if not a multi-release jar (?). I fixed it to return 0.
(2) setVersioned() says "If the specified version is 0 then this is configured to be an unversioned jar file". The implementation seems treat anything< BASE_VERSION as such? Sure, technically it's still correct. Given the BASE_VERSION is a private field, any thing between 0 and BASE_VERSION becomes unspecified gray area. I clarified it in the documentation. It now says "If the specified version is less than 9, then this JarFile is configured to be processed as if it is an unversioned jar file.” I also made some additional clarifying documentation changes in overall documentation for JarFile.
(3) getRuntimeVersionedEntry() spec says "If the JarFile is a multi-release jar file and is configured to be processed as such, then ...". However the implementation
if (RUNTIME_VERSION> BASE_VERSION&& !name.startsWith(META_INF)&& isMultiRelease()) { ... }
does not appears to check the logic "is configured to be …”? I took the “is configured to be…” phrase out of the documentation.
And
391 if (manifestRead) { 392 return isMultiRelease; 393 } 394 synchronized (this) { 395 // this needs to be set prior to getManifest() call to prevent recursive loop 396 manifestRead = true; 397 try { 398 Manifest man = getManifest(); 399 isMultiRelease = (man != null)&& man.getMainAttributes().containsKey(MULTI_RELEASE); 400 } catch (IOException x) { 401 isMultiRelease = false; 402 } 403 } 404 return isMultiRelease; don't we have a race condition if #391/2 is outside the synchronized block? for sample, one thread is inside sync block and set the manifestRead = true, but before isMultiRelease = true/false, and the second thread gets to #391. We rewrote the isMultirelease() method to eliminate the race. We also simplified it, removing the recursive invocation, with a small change to the getManEntry() method.
Thanks, Steve
On 18/11/2015 21:27, Steve Drach wrote:
Hi,
Please review the latest iteration of the webrev for runtime support of multi-release jar files.
I think the latest looks okay and addressed all the issues that I was concerned with. -Alan
Hi, CCC had some suggestions to improve the code that was previously approved during review. As a consequence, I had to make significant changes to the API, and I believe the code need further review. Please review the latest iteration of the webrev for runtime support of multi-release jar files. Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~sdrach/8132734/webrev.02/ <http://cr.openjdk.java.net/~sdrach/8132734/webrev.02/> Here’s a list of things that were changed: 1. Removed setVersion, setRuntimeVersioned, and getRuntimeVersionedEntry. Also removed the configuration lock. 2. Added a Release enum to represent releases. 3. Added a new constructor that uses the enum as one of it’s arguments. This is the only way to make multi-release JarFile objects. 4. Added new code to provide runtime versioning for class loaders using the new constructor. 4. Made the “version” int final, removing the volatile modifier. 5. Made the two remaining public methods, getVersion and isMultiRelease final. 6. Added implSpec to getEntry and getJarEntry so potential subclassers are aware of how things work. 7. Rewrote and added tests to deal with the new constructor and the Release enum. 8. Added a simple http server to test getting jar files from both a local files system and from the network. Thanks, Steve
Hi, This is a repeat of the RFR I sent last Wed (Jan 13). CCC had some suggestions to improve the code that was previously approved during review. As a consequence, I had to make significant changes to the API, and I believe the code need further review. Please review the latest iteration of the webrev for runtime support of multi-release jar files. Issue: https://bugs.openjdk.java.net/browse/JDK-8132734 <https://bugs.openjdk.java.net/browse/JDK-8132734> JEP 238: https://bugs.openjdk.java.net/browse/JDK-8047305 <https://bugs.openjdk.java.net/browse/JDK-8047305> Webrev: http://cr.openjdk.java.net/~sdrach/8132734/webrev.02/ <http://cr.openjdk.java.net/~sdrach/8132734/webrev.02/> Here’s a list of things that were changed: 1. Removed setVersion, setRuntimeVersioned, and getRuntimeVersionedEntry. Also removed the configuration lock. 2. Added a Release enum to represent releases. 3. Added a new constructor that uses the enum as one of it’s arguments. This is the only way to make multi-release JarFile objects. 4. Added new code to provide runtime versioning for class loaders using the new constructor. 4. Made the “version” int final, removing the volatile modifier. 5. Made the two remaining public methods, getVersion and isMultiRelease final. 6. Added implSpec to getEntry and getJarEntry so potential subclassers are aware of how things work. 7. Rewrote and added tests to deal with the new constructor and the Release enum. 8. Added a simple http server to test getting jar files from both a local files system and from the network. Thanks, Steve
On 20/01/2016 17:56, Steve Drach wrote:
Hi,
This is a repeat of the RFR I sent last Wed (Jan 13).
:
Webrev: http://cr.openjdk.java.net/~sdrach/8132734/webrev.02/ <http://cr.openjdk.java.net/~sdrach/8132734/webrev.02/>
Overall I think the API looks much better. For Release then I have to admit that I dislike _9 and wonder if other options were considered? javax.lang.model.SourceVersion uses the RELEASE_xx convention for example. Also I wonder about Release.ROOT and whether Release.UNVERSIONED was considered? In general the phrase "root entry" in the javadoc makes me think the root or top-most directory. An alternative that might be clearer is to say "unversioned entry" and define that term clearly in the class description. The javadoc for Release.RUNTIME looks like it will have a javadoc link to jdk.Version but that is a JDK-specific API. Could the text starting "The effective runtime ..." move to an @implNote? I assume @since will be added to Release before this is pushed. The new JarFile ctor doesn't seem to handle the case that version is null when multi release is forced. Also I assume it should specify @throws SecurityException. Could the legacy JarFile ctor be changed to this(file, verify, mode, Release.ROOT) to avoid duplication? I don't have time to do a detailed pass over the updated tests but I wonder if SimpleHttpServer is really a candidate to put in the testlibrary tree. It looks like it is very specific to multi-release JARs and so I would expect to be co-located with those tests rather than being a hazard in the testlibrary tree. A small comment is that it would be good to fix the really long lines before these changes are pushed. This will help future side-by-side reviews where it would be otherwise annoying to have to do horizontal scrolling to view diffs. -Alan.
Thank you for the review Alan. See comments in line below.
Overall I think the API looks much better.
With the advantage of being much simpler too.
For Release then I have to admit that I dislike _9 and wonder if other options were considered? javax.lang.model.SourceVersion uses the RELEASE_xx convention for example.
I suspected this is a bike shed candidate. I think Release._9 is nicer and it conveys the same information in a less cluttered way than Release.RELEASE_9.
Also I wonder about Release.ROOT and whether Release.UNVERSIONED was considered? In general the phrase "root entry" in the javadoc makes me think the root or top-most directory. An alternative that might be clearer is to say "unversioned entry" and define that term clearly in the class description.
The entries in a legacy jar (the only entries) or in the unversioned section of a multi-release jar are directly under the top-most directory
The javadoc for Release.RUNTIME looks like it will have a javadoc link to jdk.Version but that is a JDK-specific API. Could the text starting "The effective runtime ..." move to an @implNote?
Okay
I assume @since will be added to Release before this is pushed.
Yes.
The new JarFile ctor doesn't seem to handle the case that version is null when multi release is forced. Also I assume it should specify @throws SecurityException.
Both will be fixed.
Could the legacy JarFile ctor be changed to this(file, verify, mode, Release.ROOT) to avoid duplication?
Yes.
I don't have time to do a detailed pass over the updated tests but I wonder if SimpleHttpServer is really a candidate to put in the testlibrary tree. It looks like it is very specific to multi-release JARs and so I would expect to be co-located with those tests rather than being a hazard in the testlibrary tree.
It’s in the testlibrary under java/util/jar with the other multi-release specific test “helper” classes. I could make it even more specific by putting it under a java/util/jar/multi-release directory
A small comment is that it would be good to fix the really long lines before these changes are pushed. This will help future side-by-side reviews where it would be otherwise annoying to have to do horizontal scrolling to view diffs.
Do we really have to stick with 80 column hollerith card semantics? Even that was changed to 96 columns about 50 years ago. The one line, other than some “fixmes" that will be removed when JEP 223 is integrated, that exceeds 96 characters long will be changed by wrapping it to 94 columns.
On 21/01/2016 18:02, Steve Drach wrote:
: I suspected this is a bike shed candidate. I think Release._9 is nicer and it conveys the same information in a less cluttered way than Release.RELEASE_9. Yes a bike shed, I'm just saying that Release._9 looks odd/inconsistent when we have SourceVersion.RELEASE_9 elsewhere. Maybe there has been discussion on this topic already. With a static import then RELEASE_9 isn't too bad.
: The entries in a legacy jar (the only entries) or in the unversioned section of a multi-release jar are directly under the top-most directory All I'm saying is that Release.ROOT doesn't feel quite right, esp. when ROOT is defined as the unversioned entries.
:
I don't have time to do a detailed pass over the updated tests but I wonder if SimpleHttpServer is really a candidate to put in the testlibrary tree. It looks like it is very specific to multi-release JARs and so I would expect to be co-located with those tests rather than being a hazard in the testlibrary tree. It’s in the testlibrary under java/util/jar with the other multi-release specific test “helper” classes. I could make it even more specific by putting it under a java/util/jar/multi-release directory Yes, it needs to move to somewhere specific because it's not general purpose.
: Do we really have to stick with 80 column hollerith card semantics? Even that was changed to 96 columns about 50 years ago. The one line, other than some “fixmes" that will be removed when JEP 223 is integrated, that exceeds 96 characters long will be changed by wrapping it to 94 columns. I didn't mention 80. If you looks at the sdiffs for URLClassPath and JarFile when the outliers should be obvious. All I can suggest is to keep thing consistent with the existing code where possible.
-Alan
I suspected this is a bike shed candidate. I think Release._9 is nicer and it conveys the same information in a less cluttered way than Release.RELEASE_9. Yes a bike shed, I'm just saying that Release._9 looks odd/inconsistent when we have SourceVersion.RELEASE_9 elsewhere. Maybe there has been discussion on this topic already. With a static import then RELEASE_9 isn't too bad.
I’ll leave this as an open issue for awhile in case I get another reviewer that feels as strongly about it you do, or as I do.
: The entries in a legacy jar (the only entries) or in the unversioned section of a multi-release jar are directly under the top-most directory All I'm saying is that Release.ROOT doesn't feel quite right, esp. when ROOT is defined as the unversioned entries.
How about Release.BASE for base entries?
:
I don't have time to do a detailed pass over the updated tests but I wonder if SimpleHttpServer is really a candidate to put in the testlibrary tree. It looks like it is very specific to multi-release JARs and so I would expect to be co-located with those tests rather than being a hazard in the testlibrary tree. It’s in the testlibrary under java/util/jar with the other multi-release specific test “helper” classes. I could make it even more specific by putting it under a java/util/jar/multi-release directory Yes, it needs to move to somewhere specific because it's not general purpose.
I moved it to the same file as the test itself, making it a package-private class
Do we really have to stick with 80 column hollerith card semantics? Even that was changed to 96 columns about 50 years ago. The one line, other than some “fixmes" that will be removed when JEP 223 is integrated, that exceeds 96 characters long will be changed by wrapping it to 94 columns. I didn't mention 80. If you looks at the sdiffs for URLClassPath and JarFile when the outliers should be obvious. All I can suggest is to keep thing consistent with the existing code where possible.
I only found one instance in JarFile and one instance in URLClassPath that seemed too long. I wrapped them both to be less than 95 columns. Even for short lines when using sdiff I sometimes need to “left scroll” — not sure that making it sdiff friendly is a reasonable constraint.
On Jan 21, 2016, at 3:49 PM, Steve Drach <steve.drach@oracle.com> wrote:
I suspected this is a bike shed candidate. I think Release._9 is nicer and it conveys the same information in a less cluttered way than Release.RELEASE_9. Yes a bike shed, I'm just saying that Release._9 looks odd/inconsistent when we have SourceVersion.RELEASE_9 elsewhere. Maybe there has been discussion on this topic already. With a static import then RELEASE_9 isn't too bad.
I’ll leave this as an open issue for awhile in case I get another reviewer that feels as strongly about it you do, or as I do.
I only started looking at some files on the webrev. Release._9 catches my attention too and it looks very odd. I think RELEASE_9 is a much better constant name than _9. Mandy
On 22 Jan 2016, at 01:39, Mandy Chung <mandy.chung@oracle.com> wrote:
On Jan 21, 2016, at 3:49 PM, Steve Drach <steve.drach@oracle.com> wrote:
I suspected this is a bike shed candidate. I think Release._9 is nicer and it conveys the same information in a less cluttered way than Release.RELEASE_9. Yes a bike shed, I'm just saying that Release._9 looks odd/inconsistent when we have SourceVersion.RELEASE_9 elsewhere. Maybe there has been discussion on this topic already. With a static import then RELEASE_9 isn't too bad.
I’ll leave this as an open issue for awhile in case I get another reviewer that feels as strongly about it you do, or as I do.
I only started looking at some files on the webrev. Release._9 catches my attention too and it looks very odd. I think RELEASE_9 is a much better constant name than _9.
While there is a some naming activity over what to call such constants, i think the use a ‘_’ as the first character of a public API artefact should be strongly discouraged, such usages are more commonly associated with internal artefacts or generated code and using such a style for public artefacts sets a “bad" precedence IMO (first use in the Java APIs AFAICT). There needs to be a really strong justification for such public use and at the moment i don’t see one here. Paul.
Hi Alan, et. al., I’ve released a new webrev that addresses all the issues you raised. http://cr.openjdk.java.net/~sdrach/8132734/webrev.03/index.html <http://cr.openjdk.java.net/~sdrach/8132734/webrev.03/index.html> Specifically:
For Release then I have to admit that I dislike _9 and wonder if other options were considered? javax.lang.model.SourceVersion uses the RELEASE_xx convention for example.
Changed to VERSION_9, i.e. Release.VERSION_9
Also I wonder about Release.ROOT and whether Release.UNVERSIONED was considered? In general the phrase "root entry" in the javadoc makes me think the root or top-most directory. An alternative that might be clearer is to say "unversioned entry" and define that term clearly in the class description.
Changed to BASE, i.e. Release.BASE
The javadoc for Release.RUNTIME looks like it will have a javadoc link to jdk.Version but that is a JDK-specific API. Could the text starting "The effective runtime ..." move to an @implNote?
Done
I assume @since will be added to Release before this is pushed.
Done
The new JarFile ctor doesn't seem to handle the case that version is null when multi release is forced. Also I assume it should specify @throws SecurityException.
Yes, both done and added more @throws clauses
Could the legacy JarFile ctor be changed to this(file, verify, mode, Release.ROOT) to avoid duplication?
Done
I don't have time to do a detailed pass over the updated tests but I wonder if SimpleHttpServer is really a candidate to put in the testlibrary tree. It looks like it is very specific to multi-release JARs and so I would expect to be co-located with those tests rather than being a hazard in the testlibrary tree.
Moved SimpleHttpServer into the test that uses it, MultiReleaseJarHttpProperties
A small comment is that it would be good to fix the really long lines before these changes are pushed. This will help future side-by-side reviews where it would be otherwise annoying to have to do horizontal scrolling to view diffs.
I think I fixed this
On 22/01/2016 23:10, Steve Drach wrote:
Hi Alan, et. al.,
I’ve released a new webrev that addresses all the issues you raised.
http://cr.openjdk.java.net/~sdrach/8132734/webrev.03/index.html <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.03/index.html>
Specifically:
For Release then I have to admit that I dislike _9 and wonder if other options were considered? javax.lang.model.SourceVersion uses the RELEASE_xx convention for example.
Changed to VERSION_9, i.e. Release.VERSION_9
Also I wonder about Release.ROOT and whether Release.UNVERSIONED was considered? In general the phrase "root entry" in the javadoc makes me think the root or top-most directory. An alternative that might be clearer is to say "unversioned entry" and define that term clearly in the class description.
Changed to BASE, i.e. Release.BASE
This looks better. Release.BASE is probably okay although it still feels like Release.UNVERSIONED, esp. when it is defined as "Represents unversioned entries". I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc? -Alan.
On Jan 25, 2016, at 8:54 AM, Alan Bateman <Alan.Bateman@oracle.com> wrote:
Somehow I missed this, sorry for the delayed response.
Changed to BASE, i.e. Release.BASE
This looks better. Release.BASE is probably okay although it still feels like Release.UNVERSIONED, esp. when it is defined as "Represents unversioned entries”.
Base entries imply to me the entries that are the “base” of the jar file. All multi-release jar files have to have a set of base entries that, as a whole, export the public API of the jar file (whether it’s multi-release or not). Versioned entries “override” base entries. I could have said “Represents base entries” but that seems a little circular. Actually base entries are the set of root entries minus the set of entries in the META-INF/versions directory ;-)
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory.
To me they are a resource in the root directory, but I see your point.
I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc?
Let me see what I can do.
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc?
I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html <http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html> that addresses the above issue.
+1 Paul.
On 28 Jan 2016, at 00:37, Steve Drach <steve.drach@oracle.com> wrote:
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc?
I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html <http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html> that addresses the above issue.
Steve, What's the purpose of having a dedicated "JarFile.runtimeVersioned"? Based on its only usages at ln#356 and #381, it appears, shouldn't getVersion() simply returns Release.valueOf(version)? sherman On 01/27/2016 03:37 PM, Steve Drach wrote:
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc? I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html<http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html> that addresses the above issue.
What's the purpose of having a dedicated "JarFile.runtimeVersioned”?
Consistency, so getVersion returns the same version as the version set when JarFile was constructed.
Based on its only usages at ln#356 and #381, it appears, shouldn't getVersion() simply returns Release.valueOf(version)?
That might be confusing. Also, having it consistent helps with testing.
sherman
On 01/27/2016 03:37 PM, Steve Drach wrote:
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc? I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html<http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html> that addresses the above issue.
On 01/28/2016 11:28 AM, Steve Drach wrote:
What's the purpose of having a dedicated "JarFile.runtimeVersioned”? Consistency, so getVersion returns the same version as the version set when JarFile was constructed.
Based on its only usages at ln#356 and #381, it appears, shouldn't getVersion() simply returns Release.valueOf(version)? That might be confusing. Also, having it consistent helps with testing.
144 private final int version; 145 private final boolean runtimeVersioned; ... 355 this.version = MULTI_RELEASE_FORCED ? RUNTIME_VERSION : version.value(); 356 this.runtimeVersioned = version == Release.RUNTIME; ... 381 return runtimeVersioned ? Release.RUNTIME : Release.valueOf(version); What I meant is that if "version" is final and RUNTIME_VERSION is static final, why do we need the middle-man "runtimeVersioned". And it appears the implementation is doing at 381 is always actually to return Release.valueOf(version)? It returns Release.valueOf(version) if "runtimeVersioned" is false. But in case of "runtimeVersioned" is true, it means "version == Release.RUNTIME" at ln#356, so you can also return Release.valueOf(version)? I'm confused, what did I miss? sherman
sherman
On 01/27/2016 03:37 PM, Steve Drach wrote:
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc? I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html<http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html> that addresses the above issue.
On 01/28/2016 12:26 PM, Xueming Shen wrote:
On 01/28/2016 11:28 AM, Steve Drach wrote:
What's the purpose of having a dedicated "JarFile.runtimeVersioned”? Consistency, so getVersion returns the same version as the version set when JarFile was constructed.
Based on its only usages at ln#356 and #381, it appears, shouldn't getVersion() simply returns Release.valueOf(version)? That might be confusing. Also, having it consistent helps with testing.
144 private final int version; 145 private final boolean runtimeVersioned; ... 355 this.version = MULTI_RELEASE_FORCED ? RUNTIME_VERSION : version.value(); 356 this.runtimeVersioned = version == Release.RUNTIME; ... 381 return runtimeVersioned ? Release.RUNTIME : Release.valueOf(version);
What I meant is that if "version" is final and RUNTIME_VERSION is static final, why do we need the middle-man "runtimeVersioned". And it appears the implementation is doing at 381 is always actually to return Release.valueOf(version)?
It returns Release.valueOf(version) if "runtimeVersioned" is false. But in case of "runtimeVersioned" is true, it means "version == Release.RUNTIME" at ln#356, so you can also return Release.valueOf(version)? I'm confused, what did I miss?
my bad. the "version" at #356 is not that final one.
sherman
sherman
On 01/27/2016 03:37 PM, Steve Drach wrote:
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc? I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html<http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html> that addresses the above issue.
What's the purpose of having a dedicated "JarFile.runtimeVersioned”? Consistency, so getVersion returns the same version as the version set when JarFile was constructed.
Based on its only usages at ln#356 and #381, it appears, shouldn't getVersion() simply returns Release.valueOf(version)? That might be confusing. Also, having it consistent helps with testing.
144 private final int version; 145 private final boolean runtimeVersioned; ... 355 this.version = MULTI_RELEASE_FORCED ? RUNTIME_VERSION : version.value(); 356 this.runtimeVersioned = version == Release.RUNTIME;
Here “version” is a local argument and not equal to this.version. It can take on the values Release.BASE, Release.VERSION_9, and Release.RUNTIME. In the future we will see additional values like Release.VERSION_10 and Release.VERSION_11, etc. but this.version can only have values of 8 and 9, and in the future 10, 11, etc. this.version is a private implementation value and not publicly available.
... 381 return runtimeVersioned ? Release.RUNTIME : Release.valueOf(version);
the private method Release.valueOf(version) can only return Release.BASE and Release.VERSION_9
What I meant is that if "version" is final and RUNTIME_VERSION is static final, why do we need the middle-man "runtimeVersioned”.
So that getVersion will return Release.RUNTIME if the JarFile object was constructed with a Release.RUNTIME argument.
And it appears the implementation is doing at 381 is always actually to return Release.valueOf(version)?
Except that it will return Release.RUNTIME if the private boolean runtimeVersioned is true
It returns Release.valueOf(version) if "runtimeVersioned" is false. But in case of "runtimeVersioned" is true, it means "version == Release.RUNTIME" at ln#356,
Yes, that what it means, keeping in mind the local variable version is not equal to this.version
so you can also return Release.valueOf(version)?
You could but then one could construct a JarFile object with Release.RUNTIME as it’s version, but getVersion would return Release.VERSION_9. That’s what I meant by consistency, or in this case inconsistency.
I'm confused, what did I miss?
I think you are focussing on a minor implementation detail. Internally versions are small integer values, but externally, publicly, they are values of the Release Enum.
sherman
sherman
On 01/27/2016 03:37 PM, Steve Drach wrote:
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc? I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html<http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html> that addresses the above issue.
Steve, 78 * <p>Class loaders that utilize {@code JarFile} to load classes from the 79 * contents of {@code JarFile} entries should construct the {@code JarFile} 80 * by invoking the {@link JarFile#JarFile(File, boolean, int, Release)} 81 * constructor with the value {@code Release.RUNTIME} assigned to the last 82 * argument. This assures that classes compatible with the major 83 * version of the running JVM are loaded from multi-release jar files. What should be the expected result in scenario that the "multi-release" jar file actually puts all latest version entries in the base and early version entries in the "versions/n" directory? For example, in jdk 10, someone generates a mr-jar file with all classes for jdk 10 at root, but a set of jdk9 implementation/ version entries under "versions/9" (assume we don't have class major version compatible issue, both compiled with early version of javac). Those jdk9 version classes will always get picked, even open with "Relese.VERSION_10"? 59 * "META-INF/versions" directory. The versioned entries are partitioned by the 60 * major version of Java platform releases, starting with release 9. A 61 * versioned entry, with a version {@code n}, {@code 8 < n}, in the 62 * "META-INF/versions/{n}" directory overrides the base entry as well as any 63 * entry with a version number {@code i} where {@code 8 < i < n}. So the "versions/{n} always overrides the "base entry", as actually we don't have any info regarding the "version" of those base entries. Maybe I'm missing something here? -Sherman On 01/28/2016 11:09 AM, Xueming Shen wrote:
Steve,
What's the purpose of having a dedicated "JarFile.runtimeVersioned"? Based on its only usages at ln#356 and #381, it appears, shouldn't getVersion() simply returns Release.valueOf(version)?
sherman
On 01/27/2016 03:37 PM, Steve Drach wrote:
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc? I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html<http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html> that addresses the above issue.
78 * <p>Class loaders that utilize {@code JarFile} to load classes from the 79 * contents of {@code JarFile} entries should construct the {@code JarFile} 80 * by invoking the {@link JarFile#JarFile(File, boolean, int, Release)} 81 * constructor with the value {@code Release.RUNTIME} assigned to the last 82 * argument. This assures that classes compatible with the major 83 * version of the running JVM are loaded from multi-release jar files.
What should be the expected result in scenario that the "multi-release" jar file actually puts all latest version entries in the base and early version entries in the "versions/n" directory? For example, in jdk 10, someone generates a mr-jar file with all classes for jdk 10 at root, but a set of jdk9 implementation/ version entries under "versions/9" (assume we don't have class major version compatible issue, both compiled with early version of javac). Those jdk9 version classes will always get picked, even open with "Relese.VERSION_10”?
There really is no way to compel a jar file creator to make a valid multi-release jar file, and there is no way to check a multi-release jar file to assure it’s valid. While one might be tempted to use the class file version as an indicator of what release it belongs to, that is not necessarily accurate or meaningful. Likewise, resource files can not be partitioned into release version based on content. While the new jar tool will have the capability to assure versioned classes have the same public api as base entry classes and that identical classes aren’t stored in both the base and versioned sections, it too depends on the input being correctly classified by the jar tool user. Garbage in, garbage out.
59 * "META-INF/versions" directory. The versioned entries are partitioned by the 60 * major version of Java platform releases, starting with release 9. A 61 * versioned entry, with a version {@code n}, {@code 8 < n}, in the 62 * "META-INF/versions/{n}" directory overrides the base entry as well as any 63 * entry with a version number {@code i} where {@code 8 < i < n}.
So the "versions/{n} always overrides the "base entry", as actually we don't have any info regarding the "version" of those base entries. Maybe I'm missing something here?
You are not missing anything. If one constructs a JarFile with a Release other than Release.BASE, a base entry will be overridden by a versioned entry if one is found. The bottom line is that the creator of a multi-release jar file must take responsibility for creating a valid multi-release jar. Jar tool will help point out some inconsistencies but can’t actually determine if an entry has been assigned to the right release version. And, of course, one does not need jar tool to construct a multi-release jar file — see jdk/test/lib/testlibrary/java/util/jar/CreateMultiReleaseTestJars.java
-Sherman
On 01/28/2016 11:09 AM, Xueming Shen wrote:
Steve,
What's the purpose of having a dedicated "JarFile.runtimeVersioned"? Based on its only usages at ln#356 and #381, it appears, shouldn't getVersion() simply returns Release.valueOf(version)?
sherman
On 01/27/2016 03:37 PM, Steve Drach wrote:
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc? I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html<http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html> that addresses the above issue.
On 27/01/2016 23:37, Steve Drach wrote:
I'm still wondering about the phrase "root entry" as it continues to give the impression (to me anyway) that it's a resource in the root directory. I think "root" works in the JEP because it deals with simple resources like A.class and B.class that are in the root directory but it's confusing when there resources with a slash in the name. Add to this is the META-INF/versions/<n> directories which are roots for the version specific resources. I think part of the confusion is that the first mention of "root entry" is in the second paragraph where it has "overrides the unversioned root entry" without defining what it means. In summary, I'm wondering whether you would be up for change the terminology so that "root entry" isn't in the javadoc?
I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.04/index.html> that addresses the above issue.
Thank you, the javadoc is much clearer and readable now. The first mention of multi-release JARs is in the first paragraph where it has "for processing multi-release jar files". I would be tempted to drop this from the first paragraph because the second paragraph introduces the concept. In the second paragraph is has "The versioned entries are partitioned by the major version of Java platform releases, starting with release 9" and then it goes on to explain how versioned entries are partitioned. This has potential to confuse the reader as it gives an initial impression that the oldest version entry is 9 but then the says 8 < n. I realize the text is trying to say that Java SE 9 is the first release to support this but it could be confused. I would be tempted to just drop the mention of release 9 in this paragraph. This may have come up before but JarFile has two sets of constructors, one takes the file path as a String, the other as a File. I just wondering about introduce a second constructor so that they match. One other thing that I've been wondering about is the stream() and entries() methods. Has there been any discussion about these doing filter? Maybe it is too expensive and best left to the user of the API? Part of the context for asking is modular JARs of course. -Alan
I’ve released a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/index.html <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.04/index.html> that addresses the above issue.
Thank you, the javadoc is much clearer and readable now.
The first mention of multi-release JARs is in the first paragraph where it has "for processing multi-release jar files". I would be tempted to drop this from the first paragraph because the second paragraph introduces the concept.
It makes sense to leave it in there because of the context. It’s one of 2 things that differentiate JarFile from ZipFile .
In the second paragraph is has "The versioned entries are partitioned by the major version of Java platform releases, starting with release 9" and then it goes on to explain how versioned entries are partitioned. This has potential to confuse the reader as it gives an initial impression that the oldest version entry is 9 but then the says 8 < n. I realize the text is trying to say that Java SE 9 is the first release to support this but it could be confused. I would be tempted to just drop the mention of release 9 in this paragraph.
I’ll do that.
This may have come up before but JarFile has two sets of constructors, one takes the file path as a String, the other as a File. I just wondering about introduce a second constructor so that they match.
We felt that one constructor was sufficient on the theory that it’s use will be infrequent, at least initially. And it’s easy to map a String to a File.
One other thing that I've been wondering about is the stream() and entries() methods. Has there been any discussion about these doing filter?
Not that I’m aware of.
Maybe it is too expensive and best left to the user of the API? Part of the context for asking is modular JARs of course.
Perhaps you can elaborate.
On 29 Jan 2016, at 18:27, Steve Drach <steve.drach@oracle.com> wrote:
This may have come up before but JarFile has two sets of constructors, one takes the file path as a String, the other as a File. I just wondering about introduce a second constructor so that they match.
We felt that one constructor was sufficient on the theory that it’s use will be infrequent, at least initially. And it’s easy to map a String to a File.
Right, my preference is to stick to one for the moment and add another later if really needed.
One other thing that I've been wondering about is the stream() and entries() methods. Has there been any discussion about these doing filter?
Not that I’m aware of.
Maybe it is too expensive and best left to the user of the API? Part of the context for asking is modular JARs of course.
Perhaps you can elaborate.
Alan’s point is that traversing using entries()/stream() always returns the versioned entries (if any) rather than all entries, thus in a sense filters. My assumption was the traversal should by default be consistent with a calls to getEntry, thus: jarFile.stream().forEach(e -> { JarEntry je = jarFile.getJarEntry(e.getName()); assert e.equals(je); }); There might need to be another stream method that returns all entries. Paul.
One other thing that I've been wondering about is the stream() and entries() methods. Has there been any discussion about these doing filter?
Not that I’m aware of.
Maybe it is too expensive and best left to the user of the API? Part of the context for asking is modular JARs of course.
Perhaps you can elaborate.
Alan’s point is that traversing using entries()/stream() always returns the versioned entries (if any) rather than all entries, thus in a sense filters.
My assumption was the traversal should by default be consistent with a calls to getEntry, thus:
jarFile.stream().forEach(e -> { JarEntry je = jarFile.getJarEntry(e.getName()); assert e.equals(je); });
There might need to be another stream method that returns all entries.
JarFile.entries() and JarFile.stream() return all the entries; they are not filtered. Your example would work if the JarFile is constructed without a Release argument or with the Release.BASE argument. The assertion would fail if it’s constructed with any other Release argument. Adding a filter and a map to the stream pipeline would make your example succeed with all values of Release.
On 29/01/2016 17:39, Paul Sandoz wrote:
: Alan’s point is that traversing using entries()/stream() always returns the versioned entries (if any) rather than all entries, thus in a sense filters.
My assumption was the traversal should by default be consistent with a calls to getEntry, thus:
jarFile.stream().forEach(e -> { JarEntry je = jarFile.getJarEntry(e.getName()); assert e.equals(je); });
There might need to be another stream method that returns all entries.
Right, I'm mostly just wondering if entries()/streams() should override the entries in the stream with versioned entries and filter out the META-INF/versions/ tree. If I've gone to trouble of specifying the a Release then it seems the right thing to do. On the other hand, it comes at a cost and there will be use-cases like "get the names of all entries" that would be more efficient to just build on the current entries()/stream(). I'm loath to suggest this might need a new method but it might be one of the options to consider here. Minimally there is a javadoc to specify on how these methods behave when the JAR is multi-release and opened by specifying a release. -Alan
On Jan 30, 2016, at 12:00 AM, Alan Bateman <Alan.Bateman@oracle.com> wrote:
On 29/01/2016 17:39, Paul Sandoz wrote:
: Alan’s point is that traversing using entries()/stream() always returns the versioned entries (if any) rather than all entries, thus in a sense filters.
My assumption was the traversal should by default be consistent with a calls to getEntry, thus:
jarFile.stream().forEach(e -> { JarEntry je = jarFile.getJarEntry(e.getName()); assert e.equals(je); });
There might need to be another stream method that returns all entries.
Right, I'm mostly just wondering if entries()/streams() should override the entries in the stream with versioned entries and filter out the META-INF/versions/ tree.
I don’t think so. That kind of behavior might be difficult to understand. Returning all the entries provides some flexibility. One can write code like this: jarfile.stream().map(JarEntry::getName).filter(s -> !s.startsWith(“META-INF”)).map(JarFile::getJarEntry).etc to get the versioned results for any version you specify when constructing the JarFile.
If I've gone to trouble of specifying the a Release then it seems the right thing to do. On the other hand, it comes at a cost and there will be use-cases like "get the names of all entries" that would be more efficient to just build on the current entries()/stream(). I'm loath to suggest this might need a new method but it might be one of the options to consider here. Minimally there is a javadoc to specify on how these methods behave when the JAR is multi-release and opened by specifying a release.
How’s this? diff -r 68867430065b src/java.base/share/classes/java/util/jar/JarFile.java --- a/src/java.base/share/classes/java/util/jar/JarFile.java Fri Jan 29 12:34:44 2016 -0800 +++ b/src/java.base/share/classes/java/util/jar/JarFile.java Mon Feb 01 09:48:05 2016 -0800 @@ -576,9 +576,11 @@ } /** - * Returns an enumeration of the jar file entries. + * Returns an enumeration of all the jar file entries. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an enumeration of the jar file entries + * @return an enumeration of the all jar file entries * @throws IllegalStateException * may be thrown if the jar file has been closed */ @@ -587,11 +589,13 @@ } /** - * Returns an ordered {@code Stream} over the jar file entries. + * Returns an ordered {@code Stream} over all the jar file entries. * Entries appear in the {@code Stream} in the order they appear in - * the central directory of the jar file. + * the central directory of the jar file. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an ordered {@code Stream} of entries in this jar file + * @return an ordered {@code Stream} of all entries in this jar file * @throws IllegalStateException if the jar file has been closed * @since 1.8 */
On 02/01/2016 09:58 AM, Steve Drach wrote:
On Jan 30, 2016, at 12:00 AM, Alan Bateman<Alan.Bateman@oracle.com> wrote:
On 29/01/2016 17:39, Paul Sandoz wrote:
: Alan’s point is that traversing using entries()/stream() always returns the versioned entries (if any) rather than all entries, thus in a sense filters.
My assumption was the traversal should by default be consistent with a calls to getEntry, thus:
jarFile.stream().forEach(e -> { JarEntry je = jarFile.getJarEntry(e.getName()); assert e.equals(je); });
There might need to be another stream method that returns all entries.
Right, I'm mostly just wondering if entries()/streams() should override the entries in the stream with versioned entries and filter out the META-INF/versions/ tree. I don’t think so. That kind of behavior might be difficult to understand. Returning all the entries provides some flexibility. One can write code like this:
jarfile.stream().map(JarEntry::getName).filter(s -> !s.startsWith(“META-INF”)).map(JarFile::getJarEntry).etc
to get the versioned results for any version you specify when constructing the JarFile.
The current specification treats those class files under meta-inf/releases like kind of "metadata" of those base entries. Ideally those files should not even be individual "files", but part of their corresponding entries. The consumer of the MR-Jar should not need to be aware of these version-ed entries at all to use this MR-jar file to load classes/resources. From this point of view, these entries probably should be "invisible" from entries()/stream(), when the jar file is opened with "version-enabled". And all returned entries should have all their "data" (size, csize, timestamps, crc ...) pointed to the corresponding version-ed entries, withe the only exception is the "name". On the other hand it might be desired to keep JarFile.entries()/stream() asis to match their "zip file" perspective, to return "all" raw entries. Then it might also be desired to have an alternative "versioned streamVersion()" ... something like public Stream<JarEntry> stream(Release r); ? -sherman
If I've gone to trouble of specifying the a Release then it seems the right thing to do. On the other hand, it comes at a cost and there will be use-cases like "get the names of all entries" that would be more efficient to just build on the current entries()/stream(). I'm loath to suggest this might need a new method but it might be one of the options to consider here. Minimally there is a javadoc to specify on how these methods behave when the JAR is multi-release and opened by specifying a release. How’s this?
diff -r 68867430065b src/java.base/share/classes/java/util/jar/JarFile.java --- a/src/java.base/share/classes/java/util/jar/JarFile.java Fri Jan 29 12:34:44 2016 -0800 +++ b/src/java.base/share/classes/java/util/jar/JarFile.java Mon Feb 01 09:48:05 2016 -0800 @@ -576,9 +576,11 @@ }
/** - * Returns an enumeration of the jar file entries. + * Returns an enumeration of all the jar file entries. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an enumeration of the jar file entries + * @return an enumeration of the all jar file entries * @throws IllegalStateException * may be thrown if the jar file has been closed */ @@ -587,11 +589,13 @@ }
/** - * Returns an ordered {@code Stream} over the jar file entries. + * Returns an ordered {@code Stream} over all the jar file entries. * Entries appear in the {@code Stream} in the order they appear in - * the central directory of the jar file. + * the central directory of the jar file. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an ordered {@code Stream} of entries in this jar file + * @return an ordered {@code Stream} of all entries in this jar file * @throws IllegalStateException if the jar file has been closed * @since 1.8 */
Alan’s point is that traversing using entries()/stream() always returns the versioned entries (if any) rather than all entries, thus in a sense filters.
My assumption was the traversal should by default be consistent with a calls to getEntry, thus:
jarFile.stream().forEach(e -> { JarEntry je = jarFile.getJarEntry(e.getName()); assert e.equals(je); });
There might need to be another stream method that returns all entries.
Right, I'm mostly just wondering if entries()/streams() should override the entries in the stream with versioned entries and filter out the META-INF/versions/ tree. I don’t think so. That kind of behavior might be difficult to understand. Returning all the entries provides some flexibility. One can write code like this:
jarfile.stream().map(JarEntry::getName).filter(s -> !s.startsWith(“META-INF”)).map(JarFile::getJarEntry).etc
to get the versioned results for any version you specify when constructing the JarFile.
The current specification treats those class files under meta-inf/releases like kind of "metadata" of those base entries. Ideally those files should not even be individual "files", but part of their corresponding entries. The consumer of the MR-Jar should not need to be aware of these version-ed entries at all to use this MR-jar file to load classes/resources. From this point of view, these entries probably should be "invisible" from entries()/stream(), when the jar file is opened with "version-enabled". And all returned entries should have all their "data" (size, csize, timestamps, crc ...) pointed to the corresponding version-ed entries, withe the only exception is the "name".
On the other hand it might be desired to keep JarFile.entries()/stream() asis to match their "zip file" perspective, to return "all" raw entries. Then it might also be desired to have an alternative "versioned streamVersion()" …
It seems to that we have two reasonable alternatives: (1) return all entries, and (2) return all entries except those under the “META-INF/versions/“ directory and for any entries returned, return their versioned equivalent if it exists. If we choose alternative 2, we can still get alternative 1 by asking for JarFile.super.entries() and JarFile.super.stream(). Or we can do it both ways, leaving entries()/stream() as is and adding two new methods, versionedEntries() and versionedStream().
something like
public Stream<JarEntry> stream(Release r); ?
We should not parametrize the methods with a Release, because what does it mean if we construct the JarFile with one Release but specify a different Release for the stream argument. Parameterizing methods with a Release object feels like we’re starting to slide down a slippery slope. I think adding the two new methods is the “right” solution, but I’d like some consensus here.
-sherman
If I've gone to trouble of specifying the a Release then it seems the right thing to do. On the other hand, it comes at a cost and there will be use-cases like "get the names of all entries" that would be more efficient to just build on the current entries()/stream(). I'm loath to suggest this might need a new method but it might be one of the options to consider here. Minimally there is a javadoc to specify on how these methods behave when the JAR is multi-release and opened by specifying a release. How’s this?
diff -r 68867430065b src/java.base/share/classes/java/util/jar/JarFile.java --- a/src/java.base/share/classes/java/util/jar/JarFile.java Fri Jan 29 12:34:44 2016 -0800 +++ b/src/java.base/share/classes/java/util/jar/JarFile.java Mon Feb 01 09:48:05 2016 -0800 @@ -576,9 +576,11 @@ }
/** - * Returns an enumeration of the jar file entries. + * Returns an enumeration of all the jar file entries. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an enumeration of the jar file entries + * @return an enumeration of the all jar file entries * @throws IllegalStateException * may be thrown if the jar file has been closed */ @@ -587,11 +589,13 @@ }
/** - * Returns an ordered {@code Stream} over the jar file entries. + * Returns an ordered {@code Stream} over all the jar file entries. * Entries appear in the {@code Stream} in the order they appear in - * the central directory of the jar file. + * the central directory of the jar file. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an ordered {@code Stream} of entries in this jar file + * @return an ordered {@code Stream} of all entries in this jar file * @throws IllegalStateException if the jar file has been closed * @since 1.8 */
I’m sorry, I didn’t look at the code close enough before I started talking ;-) Right now entries()/stream() returns all entries and if the JarFile is constructed with a Release object != Release.BASE, the base entries that are returned are the versioned entries. I think this behavior is a bit confusing and we should just return all entries without regard to versioning. Then create the two new methods for specific versioned entries.
On Feb 1, 2016, at 12:18 PM, Steve Drach <steve.drach@oracle.com> wrote:
Alan’s point is that traversing using entries()/stream() always returns the versioned entries (if any) rather than all entries, thus in a sense filters.
My assumption was the traversal should by default be consistent with a calls to getEntry, thus:
jarFile.stream().forEach(e -> { JarEntry je = jarFile.getJarEntry(e.getName()); assert e.equals(je); });
There might need to be another stream method that returns all entries.
Right, I'm mostly just wondering if entries()/streams() should override the entries in the stream with versioned entries and filter out the META-INF/versions/ tree. I don’t think so. That kind of behavior might be difficult to understand. Returning all the entries provides some flexibility. One can write code like this:
jarfile.stream().map(JarEntry::getName).filter(s -> !s.startsWith(“META-INF”)).map(JarFile::getJarEntry).etc
to get the versioned results for any version you specify when constructing the JarFile.
The current specification treats those class files under meta-inf/releases like kind of "metadata" of those base entries. Ideally those files should not even be individual "files", but part of their corresponding entries. The consumer of the MR-Jar should not need to be aware of these version-ed entries at all to use this MR-jar file to load classes/resources. From this point of view, these entries probably should be "invisible" from entries()/stream(), when the jar file is opened with "version-enabled". And all returned entries should have all their "data" (size, csize, timestamps, crc ...) pointed to the corresponding version-ed entries, withe the only exception is the "name".
On the other hand it might be desired to keep JarFile.entries()/stream() asis to match their "zip file" perspective, to return "all" raw entries. Then it might also be desired to have an alternative "versioned streamVersion()" …
It seems to that we have two reasonable alternatives: (1) return all entries, and (2) return all entries except those under the “META-INF/versions/“ directory and for any entries returned, return their versioned equivalent if it exists. If we choose alternative 2, we can still get alternative 1 by asking for JarFile.super.entries() and JarFile.super.stream().
Or we can do it both ways, leaving entries()/stream() as is and adding two new methods, versionedEntries() and versionedStream().
something like
public Stream<JarEntry> stream(Release r); ?
We should not parametrize the methods with a Release, because what does it mean if we construct the JarFile with one Release but specify a different Release for the stream argument. Parameterizing methods with a Release object feels like we’re starting to slide down a slippery slope.
I think adding the two new methods is the “right” solution, but I’d like some consensus here.
-sherman
If I've gone to trouble of specifying the a Release then it seems the right thing to do. On the other hand, it comes at a cost and there will be use-cases like "get the names of all entries" that would be more efficient to just build on the current entries()/stream(). I'm loath to suggest this might need a new method but it might be one of the options to consider here. Minimally there is a javadoc to specify on how these methods behave when the JAR is multi-release and opened by specifying a release. How’s this?
diff -r 68867430065b src/java.base/share/classes/java/util/jar/JarFile.java --- a/src/java.base/share/classes/java/util/jar/JarFile.java Fri Jan 29 12:34:44 2016 -0800 +++ b/src/java.base/share/classes/java/util/jar/JarFile.java Mon Feb 01 09:48:05 2016 -0800 @@ -576,9 +576,11 @@ }
/** - * Returns an enumeration of the jar file entries. + * Returns an enumeration of all the jar file entries. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an enumeration of the jar file entries + * @return an enumeration of the all jar file entries * @throws IllegalStateException * may be thrown if the jar file has been closed */ @@ -587,11 +589,13 @@ }
/** - * Returns an ordered {@code Stream} over the jar file entries. + * Returns an ordered {@code Stream} over all the jar file entries. * Entries appear in the {@code Stream} in the order they appear in - * the central directory of the jar file. + * the central directory of the jar file. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an ordered {@code Stream} of entries in this jar file + * @return an ordered {@code Stream} of all entries in this jar file * @throws IllegalStateException if the jar file has been closed * @since 1.8 */
I believe the answer should be based on your viewpoint of what "is" a JAR. Do you consider the JAR to be a dumb ZIP container or an artifact of the Java runtime? If it's the former, then return everything because the version folder is meaningless in this perspective. Otherwise, treat the JAR according to how Java runtime would load it, which is the versioned entries. Whatever your answer is, that should be the behavior for entries()/stream(). Those who don't like the behavior should be given a second set of methods to customize the return. Cheers, Paul On Mon, Feb 1, 2016 at 2:29 PM, Steve Drach <steve.drach@oracle.com> wrote:
I’m sorry, I didn’t look at the code close enough before I started talking ;-) Right now entries()/stream() returns all entries and if the JarFile is constructed with a Release object != Release.BASE, the base entries that are returned are the versioned entries. I think this behavior is a bit confusing and we should just return all entries without regard to versioning. Then create the two new methods for specific versioned entries.
On Feb 1, 2016, at 12:18 PM, Steve Drach <steve.drach@oracle.com> wrote:
Alan’s point is that traversing using entries()/stream() always returns the versioned entries (if any) rather than all entries, thus in a sense filters.
My assumption was the traversal should by default be consistent with a calls to getEntry, thus:
jarFile.stream().forEach(e -> { JarEntry je = jarFile.getJarEntry(e.getName()); assert e.equals(je); });
There might need to be another stream method that returns all entries.
Right, I'm mostly just wondering if entries()/streams() should override the entries in the stream with versioned entries and filter out the META-INF/versions/ tree. I don’t think so. That kind of behavior might be difficult to understand. Returning all the entries provides some flexibility. One can write code like this:
jarfile.stream().map(JarEntry::getName).filter(s -> !s.startsWith(“META-INF”)).map(JarFile::getJarEntry).etc
to get the versioned results for any version you specify when constructing the JarFile.
The current specification treats those class files under meta-inf/releases like kind of "metadata" of those base entries. Ideally those files should not even be individual "files", but part of their corresponding entries. The consumer of the MR-Jar should not need to be aware of these version-ed entries at all to use this MR-jar file to load classes/resources. From this point of view, these entries probably should be "invisible" from entries()/stream(), when the jar file is opened with "version-enabled". And all returned entries should have all their "data" (size, csize, timestamps, crc ...) pointed to the corresponding version-ed entries, withe the only exception is the "name".
On the other hand it might be desired to keep JarFile.entries()/stream() asis to match their "zip file" perspective, to return "all" raw entries. Then it might also be desired to have an alternative "versioned streamVersion()" …
It seems to that we have two reasonable alternatives: (1) return all entries, and (2) return all entries except those under the “META-INF/versions/“ directory and for any entries returned, return their versioned equivalent if it exists. If we choose alternative 2, we can still get alternative 1 by asking for JarFile.super.entries() and JarFile.super.stream().
Or we can do it both ways, leaving entries()/stream() as is and adding two new methods, versionedEntries() and versionedStream().
something like
public Stream<JarEntry> stream(Release r); ?
We should not parametrize the methods with a Release, because what does it mean if we construct the JarFile with one Release but specify a different Release for the stream argument. Parameterizing methods with a Release object feels like we’re starting to slide down a slippery slope.
I think adding the two new methods is the “right” solution, but I’d like some consensus here.
-sherman
If I've gone to trouble of specifying the a Release then it seems the
right thing to do. On the other hand, it comes at a cost and there will be use-cases like "get the names of all entries" that would be more efficient to just build on the current entries()/stream(). I'm loath to suggest this might need a new method but it might be one of the options to consider here. Minimally there is a javadoc to specify on how these methods behave when the JAR is multi-release and opened by specifying a release.
How’s this?
diff -r 68867430065b src/java.base/share/classes/java/util/jar/JarFile.java --- a/src/java.base/share/classes/java/util/jar/JarFile.java Fri Jan 29 12:34:44 2016 -0800 +++ b/src/java.base/share/classes/java/util/jar/JarFile.java Mon Feb 01 09:48:05 2016 -0800 @@ -576,9 +576,11 @@ }
/** - * Returns an enumeration of the jar file entries. + * Returns an enumeration of all the jar file entries. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an enumeration of the jar file entries + * @return an enumeration of the all jar file entries * @throws IllegalStateException * may be thrown if the jar file has been closed */ @@ -587,11 +589,13 @@ }
/** - * Returns an ordered {@code Stream} over the jar file entries. + * Returns an ordered {@code Stream} over all the jar file entries. * Entries appear in the {@code Stream} in the order they appear in - * the central directory of the jar file. + * the central directory of the jar file. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an ordered {@code Stream} of entries in this jar file + * @return an ordered {@code Stream} of all entries in this jar file * @throws IllegalStateException if the jar file has been closed * @since 1.8 */
I think the following is a reasonable solution. If the JarFile is constructed with any of the 5 constructors that do not contain a Release argument, then entries()/stream() returns the set of all entries in the jar file including those under the META-INF/versions/directory. The base entries are “raw”, they are not aliases for versioned entries. If the JarFile is constructed with the 1 constructor that does include a Release argument, then entries()/stream() returns the set of appropriately versioned entries that would result from invoking getEntry()/getJarEntry() with the name of each base entry. The entries in the tree rooted at the directory META-INF/versions/ are not returned.
On Feb 1, 2016, at 12:29 PM, Steve Drach <steve.drach@oracle.com> wrote:
I’m sorry, I didn’t look at the code close enough before I started talking ;-) Right now entries()/stream() returns all entries and if the JarFile is constructed with a Release object != Release.BASE, the base entries that are returned are the versioned entries. I think this behavior is a bit confusing and we should just return all entries without regard to versioning. Then create the two new methods for specific versioned entries.
On Feb 1, 2016, at 12:18 PM, Steve Drach <steve.drach@oracle.com> wrote:
Alan’s point is that traversing using entries()/stream() always returns the versioned entries (if any) rather than all entries, thus in a sense filters.
My assumption was the traversal should by default be consistent with a calls to getEntry, thus:
jarFile.stream().forEach(e -> { JarEntry je = jarFile.getJarEntry(e.getName()); assert e.equals(je); });
There might need to be another stream method that returns all entries.
Right, I'm mostly just wondering if entries()/streams() should override the entries in the stream with versioned entries and filter out the META-INF/versions/ tree. I don’t think so. That kind of behavior might be difficult to understand. Returning all the entries provides some flexibility. One can write code like this:
jarfile.stream().map(JarEntry::getName).filter(s -> !s.startsWith(“META-INF”)).map(JarFile::getJarEntry).etc
to get the versioned results for any version you specify when constructing the JarFile.
The current specification treats those class files under meta-inf/releases like kind of "metadata" of those base entries. Ideally those files should not even be individual "files", but part of their corresponding entries. The consumer of the MR-Jar should not need to be aware of these version-ed entries at all to use this MR-jar file to load classes/resources. From this point of view, these entries probably should be "invisible" from entries()/stream(), when the jar file is opened with "version-enabled". And all returned entries should have all their "data" (size, csize, timestamps, crc ...) pointed to the corresponding version-ed entries, withe the only exception is the "name".
On the other hand it might be desired to keep JarFile.entries()/stream() asis to match their "zip file" perspective, to return "all" raw entries. Then it might also be desired to have an alternative "versioned streamVersion()" …
It seems to that we have two reasonable alternatives: (1) return all entries, and (2) return all entries except those under the “META-INF/versions/“ directory and for any entries returned, return their versioned equivalent if it exists. If we choose alternative 2, we can still get alternative 1 by asking for JarFile.super.entries() and JarFile.super.stream().
Or we can do it both ways, leaving entries()/stream() as is and adding two new methods, versionedEntries() and versionedStream().
something like
public Stream<JarEntry> stream(Release r); ?
We should not parametrize the methods with a Release, because what does it mean if we construct the JarFile with one Release but specify a different Release for the stream argument. Parameterizing methods with a Release object feels like we’re starting to slide down a slippery slope.
I think adding the two new methods is the “right” solution, but I’d like some consensus here.
-sherman
If I've gone to trouble of specifying the a Release then it seems the right thing to do. On the other hand, it comes at a cost and there will be use-cases like "get the names of all entries" that would be more efficient to just build on the current entries()/stream(). I'm loath to suggest this might need a new method but it might be one of the options to consider here. Minimally there is a javadoc to specify on how these methods behave when the JAR is multi-release and opened by specifying a release. How’s this?
diff -r 68867430065b src/java.base/share/classes/java/util/jar/JarFile.java --- a/src/java.base/share/classes/java/util/jar/JarFile.java Fri Jan 29 12:34:44 2016 -0800 +++ b/src/java.base/share/classes/java/util/jar/JarFile.java Mon Feb 01 09:48:05 2016 -0800 @@ -576,9 +576,11 @@ }
/** - * Returns an enumeration of the jar file entries. + * Returns an enumeration of all the jar file entries. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an enumeration of the jar file entries + * @return an enumeration of the all jar file entries * @throws IllegalStateException * may be thrown if the jar file has been closed */ @@ -587,11 +589,13 @@ }
/** - * Returns an ordered {@code Stream} over the jar file entries. + * Returns an ordered {@code Stream} over all the jar file entries. * Entries appear in the {@code Stream} in the order they appear in - * the central directory of the jar file. + * the central directory of the jar file. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an ordered {@code Stream} of entries in this jar file + * @return an ordered {@code Stream} of all entries in this jar file * @throws IllegalStateException if the jar file has been closed * @since 1.8 */
I think the following is a reasonable solution.
If the JarFile is constructed with any of the 5 constructors that do not contain a Release argument, then entries()/stream() returns the set of all entries in the jar file including those under the META-INF/versions/directory. The base entries are “raw”, they are not aliases for versioned entries.
If the JarFile is constructed with the 1 constructor that does include a Release argument, then entries()/stream() returns the set of appropriately versioned entries that would result from invoking getEntry()/getJarEntry() with the name of each base entry. The entries in the tree rooted at the directory META-INF/versions/ are not returned.
I created a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.05/ <http://cr.openjdk.java.net/~sdrach/8132734/webrev.05/>, that implements what I outlined above. In particular I enhanced the JarEntryIterator class and I added additional commentary to the entries() and stream() methods. I also added a new test, MultiReleaseJarIterators, to test entries() and stream().
On Feb 1, 2016, at 12:29 PM, Steve Drach <steve.drach@oracle.com> wrote:
I’m sorry, I didn’t look at the code close enough before I started talking ;-) Right now entries()/stream() returns all entries and if the JarFile is constructed with a Release object != Release.BASE, the base entries that are returned are the versioned entries. I think this behavior is a bit confusing and we should just return all entries without regard to versioning. Then create the two new methods for specific versioned entries.
On Feb 1, 2016, at 12:18 PM, Steve Drach <steve.drach@oracle.com> wrote:
> Alan’s point is that traversing using entries()/stream() always returns the versioned entries (if any) rather than all entries, thus in a sense filters. > > My assumption was the traversal should by default be consistent with a calls to getEntry, thus: > > jarFile.stream().forEach(e -> { > JarEntry je = jarFile.getJarEntry(e.getName()); > assert e.equals(je); > }); > > There might need to be another stream method that returns all entries. > Right, I'm mostly just wondering if entries()/streams() should override the entries in the stream with versioned entries and filter out the META-INF/versions/ tree. I don’t think so. That kind of behavior might be difficult to understand. Returning all the entries provides some flexibility. One can write code like this:
jarfile.stream().map(JarEntry::getName).filter(s -> !s.startsWith(“META-INF”)).map(JarFile::getJarEntry).etc
to get the versioned results for any version you specify when constructing the JarFile.
The current specification treats those class files under meta-inf/releases like kind of "metadata" of those base entries. Ideally those files should not even be individual "files", but part of their corresponding entries. The consumer of the MR-Jar should not need to be aware of these version-ed entries at all to use this MR-jar file to load classes/resources. From this point of view, these entries probably should be "invisible" from entries()/stream(), when the jar file is opened with "version-enabled". And all returned entries should have all their "data" (size, csize, timestamps, crc ...) pointed to the corresponding version-ed entries, withe the only exception is the "name".
On the other hand it might be desired to keep JarFile.entries()/stream() asis to match their "zip file" perspective, to return "all" raw entries. Then it might also be desired to have an alternative "versioned streamVersion()" …
It seems to that we have two reasonable alternatives: (1) return all entries, and (2) return all entries except those under the “META-INF/versions/“ directory and for any entries returned, return their versioned equivalent if it exists. If we choose alternative 2, we can still get alternative 1 by asking for JarFile.super.entries() and JarFile.super.stream().
Or we can do it both ways, leaving entries()/stream() as is and adding two new methods, versionedEntries() and versionedStream().
something like
public Stream<JarEntry> stream(Release r); ?
We should not parametrize the methods with a Release, because what does it mean if we construct the JarFile with one Release but specify a different Release for the stream argument. Parameterizing methods with a Release object feels like we’re starting to slide down a slippery slope.
I think adding the two new methods is the “right” solution, but I’d like some consensus here.
-sherman
If I've gone to trouble of specifying the a Release then it seems the right thing to do. On the other hand, it comes at a cost and there will be use-cases like "get the names of all entries" that would be more efficient to just build on the current entries()/stream(). I'm loath to suggest this might need a new method but it might be one of the options to consider here. Minimally there is a javadoc to specify on how these methods behave when the JAR is multi-release and opened by specifying a release. How’s this?
diff -r 68867430065b src/java.base/share/classes/java/util/jar/JarFile.java --- a/src/java.base/share/classes/java/util/jar/JarFile.java Fri Jan 29 12:34:44 2016 -0800 +++ b/src/java.base/share/classes/java/util/jar/JarFile.java Mon Feb 01 09:48:05 2016 -0800 @@ -576,9 +576,11 @@ }
/** - * Returns an enumeration of the jar file entries. + * Returns an enumeration of all the jar file entries. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an enumeration of the jar file entries + * @return an enumeration of the all jar file entries * @throws IllegalStateException * may be thrown if the jar file has been closed */ @@ -587,11 +589,13 @@ }
/** - * Returns an ordered {@code Stream} over the jar file entries. + * Returns an ordered {@code Stream} over all the jar file entries. * Entries appear in the {@code Stream} in the order they appear in - * the central directory of the jar file. + * the central directory of the jar file. Constructing this + * JarFile with the {@link JarFile#JarFile(File, boolean, int, Release)} + * constructor does not modify the behavior of this method. * - * @return an ordered {@code Stream} of entries in this jar file + * @return an ordered {@code Stream} of all entries in this jar file * @throws IllegalStateException if the jar file has been closed * @since 1.8 */
On 03/02/2016 02:24, Steve Drach wrote:
I created a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.05/ <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.05/>, that implements what I outlined above. In particular I enhanced the JarEntryIterator class and I added additional commentary to the entries() and stream() methods. I also added a new test, MultiReleaseJarIterators, to test entries() and stream().
I think having stream and entries do this is right although I would like to see some performance data if possible. Also I would expect that if a JAR file is not mult-release but the library opens it with Release.RUNTIME to perform the same as opening the JAR file with the Release-less constructors. I think the javadoc will need to also need to make it clear whether entries with names starting with META-INF/versions/ are returned. I see you've added @since 9 to the existing methods, I assume you didn't mean to do this. At some point then we need to discuss how RUNTIME_VERSION is computed. Iris (via Mandy) has pushed jdk.Version to jdk9/dev but having it exported by java.base conflicts with the design principles in JEP 200. Moving it to another module means that code in java.base cannot use it and thus the JAR file can't use it. -Alan.
Thanks for the comments Alan. Responses in-line.
I created a new webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.05/ <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.05/>, that implements what I outlined above. In particular I enhanced the JarEntryIterator class and I added additional commentary to the entries() and stream() methods. I also added a new test, MultiReleaseJarIterators, to test entries() and stream().
I think having stream and entries do this is right although I would like to see some performance data if possible.
I’ll see what I can do. I suspect the non-multi-release jar will be very comparable since there’s just a couple boolean tests that were added for this case.
Also I would expect that if a JAR file is not mult-release but the library opens it with Release.RUNTIME to perform the same as opening the JAR file with the Release-less constructors.
Perhaps. There is a slightly different path with an additional method call and boolean test in this case, but I’ll try to get some metrics here too.
I think the javadoc will need to also need to make it clear whether entries with names starting with META-INF/versions/ are returned.
It was a bit difficult to explain in a succinct way, but the careful reader should be able to infer that the META-INF/versions/ entries are not returned when the constructor with the Release argument is invoked. I’ll try to add some additional detail.
I see you've added @since 9 to the existing methods, I assume you didn't mean to do this.
I did mean to do it, but now that you mention it, I see it was a mistake. I’ll fix that.
At some point then we need to discuss how RUNTIME_VERSION is computed. Iris (via Mandy) has pushed jdk.Version to jdk9/dev but having it exported by java.base conflicts with the design principles in JEP 200. Moving it to another module means that code in java.base cannot use it and thus the JAR file can't use it.
I guess I need to wait until that settles down a bit.
Hi, Yet another webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/ <http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/>, with a change to JarEntryIterator to fix a problem discovered by performance tests — calling hasNext() twice as often as needed. I also removed the @since 9 tags from the methods entries() and stream(), and added an additional sentence to the spec for each of those methods that clarifies what a base entry is (actually is not).
I think having stream and entries do this is right although I would like to see some performance data if possible.
See http://cr.openjdk.java.net/~sdrach/8132734/JarFile%20Performance.pdf <http://cr.openjdk.java.net/~sdrach/8132734/JarFile%20Performance.pdf> I used JMH to run the benchmark. See http://cr.openjdk.java.net/~sdrach/8132734/MyBenchmark.java <http://cr.openjdk.java.net/~sdrach/8132734/MyBenchmark.java>. I used two jar files, the rt.jar file from JDK 8 that has 20653 entries and the multi-release.jar found in the test directory with 14 entries. Obviously rt.jar is not a multi-release jar file. The first two tables (1 and 2) are comparable and the second two tables are somewhat comparable (3 and 4). Tables 1 and 2 have 4 sections that show the results of tests on the two jar files in 4 configurations of JarFile. The tests were done with a JarFile object constructed without the Release object argument, essentially the legacy constructor. The section labeled "JDK 8 JarFile” was done with JDK 8u66. The section labeled “JDK 9 JarFile” was done with the latest build of openjdk/jdk9/dev without any changes in my 8132734 changeset. I chose this section as the reference, so the last column shows the values normalized to 1 micro/milli second per operation (rt.jar times are in milliseconds and multi-release.jar times are in microseconds). It should be obvious that JDK 9 is much faster than JDK 8, somewhere on the order of 5-6 times faster. I think that is because ZipFile doesn’t use JNI in JDK 9. Of the two remaining sections in Tables 1 and 2, the section labeled “MultiRelease JarFile” differs from the section labeled “MultiRelease JarFile, new iterator” only in the JarEntryIterator class. The first one uses the original iterator in JarFile.java that can be seen starting with line 551 of webrev.04 <http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/>, and the new iterator starts with line 553 of webrev.06 <http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/>. The results are strange. The new, more complex, iterator appears to be faster than the old, simpler, iterator. I double, and triple, checked it, but it was always faster. I used jitwatch to look at the hotspot logs generated during compilation and neither method was compiled. I suppose I could dig into it further but decided not to. Consider it good news. The results do show that the multi-release enhancement slows JarFile entries/stream down by 2-18% depending on the size of the jar file. But they are still much better than the JDK 8 values.
Also I would expect that if a JAR file is not mult-release but the library opens it with Release.RUNTIME to perform the same as opening the JAR file with the Release-less constructors.
The results in Table 3 attempts to answer this question since rt.jar is not a multi-release jar file. This tells me that if one opens the JarFile with Release.RUNTIME, that there is a performance penalty of 2-6% on this very large jar file. Finally, the results in Table 4 tell me that processing a true multi-release jar file takes about 80% more time per entries() or stream() operation. I’ve looked at this in a profiler and there is no particular area that stands out to me, it’s just more complicated to process a multi-release jar file, as would be expected.
I think the javadoc will need to also need to make it clear whether entries with names starting with META-INF/versions/ are returned.
Fixed.
I see you've added @since 9 to the existing methods, I assume you didn't mean to do this.
Fixed.
At some point then we need to discuss how RUNTIME_VERSION is computed. Iris (via Mandy) has pushed jdk.Version to jdk9/dev but having it exported by java.base conflicts with the design principles in JEP 200. Moving it to another module means that code in java.base cannot use it and thus the JAR file can't use it.
Left as is.
Hi Steve, This patch has been cooking and re-heated enough times that i think it’s as good as done in terms of being reviewed. We can follow up with further issues for other aspects, such as performance if need be. Paul.
On 10 Feb 2016, at 02:04, Steve Drach <steve.drach@oracle.com> wrote:
Hi,
Yet another webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/ <http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/>, with a change to JarEntryIterator to fix a problem discovered by performance tests — calling hasNext() twice as often as needed. I also removed the @since 9 tags from the methods entries() and stream(), and added an additional sentence to the spec for each of those methods that clarifies what a base entry is (actually is not).
I think having stream and entries do this is right although I would like to see some performance data if possible.
See http://cr.openjdk.java.net/~sdrach/8132734/JarFile%20Performance.pdf <http://cr.openjdk.java.net/~sdrach/8132734/JarFile%20Performance.pdf>
I used JMH to run the benchmark. See http://cr.openjdk.java.net/~sdrach/8132734/MyBenchmark.java <http://cr.openjdk.java.net/~sdrach/8132734/MyBenchmark.java>. I used two jar files, the rt.jar file from JDK 8 that has 20653 entries and the multi-release.jar found in the test directory with 14 entries. Obviously rt.jar is not a multi-release jar file.
The first two tables (1 and 2) are comparable and the second two tables are somewhat comparable (3 and 4).
Tables 1 and 2 have 4 sections that show the results of tests on the two jar files in 4 configurations of JarFile. The tests were done with a JarFile object constructed without the Release object argument, essentially the legacy constructor. The section labeled "JDK 8 JarFile” was done with JDK 8u66. The section labeled “JDK 9 JarFile” was done with the latest build of openjdk/jdk9/dev without any changes in my 8132734 changeset. I chose this section as the reference, so the last column shows the values normalized to 1 micro/milli second per operation (rt.jar times are in milliseconds and multi-release.jar times are in microseconds). It should be obvious that JDK 9 is much faster than JDK 8, somewhere on the order of 5-6 times faster. I think that is because ZipFile doesn’t use JNI in JDK 9.
Of the two remaining sections in Tables 1 and 2, the section labeled “MultiRelease JarFile” differs from the section labeled “MultiRelease JarFile, new iterator” only in the JarEntryIterator class. The first one uses the original iterator in JarFile.java that can be seen starting with line 551 of webrev.04 <http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/>, and the new iterator starts with line 553 of webrev.06 <http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/>. The results are strange. The new, more complex, iterator appears to be faster than the old, simpler, iterator. I double, and triple, checked it, but it was always faster. I used jitwatch to look at the hotspot logs generated during compilation and neither method was compiled. I suppose I could dig into it further but decided not to. Consider it good news. The results do show that the multi-release enhancement slows JarFile entries/stream down by 2-18% depending on the size of the jar file. But they are still much better than the JDK 8 values.
Also I would expect that if a JAR file is not mult-release but the library opens it with Release.RUNTIME to perform the same as opening the JAR file with the Release-less constructors.
The results in Table 3 attempts to answer this question since rt.jar is not a multi-release jar file. This tells me that if one opens the JarFile with Release.RUNTIME, that there is a performance penalty of 2-6% on this very large jar file.
Finally, the results in Table 4 tell me that processing a true multi-release jar file takes about 80% more time per entries() or stream() operation. I’ve looked at this in a profiler and there is no particular area that stands out to me, it’s just more complicated to process a multi-release jar file, as would be expected.
I think the javadoc will need to also need to make it clear whether entries with names starting with META-INF/versions/ are returned.
Fixed.
I see you've added @since 9 to the existing methods, I assume you didn't mean to do this.
Fixed.
At some point then we need to discuss how RUNTIME_VERSION is computed. Iris (via Mandy) has pushed jdk.Version to jdk9/dev but having it exported by java.base conflicts with the design principles in JEP 200. Moving it to another module means that code in java.base cannot use it and thus the JAR file can't use it.
Left as is.
Steve, I would assume the difference of the webrev.04 "old" iterator and the webrev.06 "new" iterator that might trigger a performance is how you create the JarFileEntry. The one parameter constructor invokes isMultiRelease(), which might be relatively expensive, when the "mr" is enabled. There are couple "if" checks involved. For the "new" iterator is slower than the current jdk9 one. It might be desired to have two JarEntryIterator classes defined, one for "notVersioned", one for the "versioned". Of course keep the two parameter constructor for the "notVersioned". This might bring performance back for the normal "notVersioned" usage, which I would assume is the majority. There is also a "bug" in the new iterator. The iterator/Enumeration returned from ZipFile.entries() checks "ensureOpen() in both hasNext() and next(). So close() the ZipFile after itr.hasNext() will fails the next next() invocation in the old implementation. The latest itr will returns the cached "ze". Not a big issue for sure, but a kind of "regression". Defining two iterator classes as suggested above will also workaround this issue, as the "notVersioned" one will work just as expected, no regression. -Sherman On 2/9/16 5:04 PM, Steve Drach wrote:
Hi,
Yet another webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/ <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.06/>, with a change to JarEntryIterator to fix a problem discovered by performance tests — calling hasNext() twice as often as needed. I also removed the @since 9 tags from the methods entries() and stream(), and added an additional sentence to the spec for each of those methods that clarifies what a base entry is (actually is not).
I think having stream and entries do this is right although I would like to see some performance data if possible.
See http://cr.openjdk.java.net/~sdrach/8132734/JarFile%20Performance.pdf <http://cr.openjdk.java.net/%7Esdrach/8132734/JarFile%20Performance.pdf>
I used JMH to run the benchmark. See http://cr.openjdk.java.net/~sdrach/8132734/MyBenchmark.java <http://cr.openjdk.java.net/%7Esdrach/8132734/MyBenchmark.java>. I used two jar files, the rt.jar file from JDK 8 that has 20653 entries and the multi-release.jar found in the test directory with 14 entries. Obviously rt.jar is not a multi-release jar file.
The first two tables (1 and 2) are comparable and the second two tables are somewhat comparable (3 and 4).
Tables 1 and 2 have 4 sections that show the results of tests on the two jar files in 4 configurations of JarFile. The tests were done with a JarFile object constructed without the Release object argument, essentially the legacy constructor. The section labeled "JDK 8 JarFile” was done with JDK 8u66. The section labeled “JDK 9 JarFile” was done with the latest build of openjdk/jdk9/dev without any changes in my 8132734 changeset. I chose this section as the reference, so the last column shows the values normalized to 1 micro/milli second per operation (rt.jar times are in milliseconds and multi-release.jar times are in microseconds). It should be obvious that JDK 9 is much faster than JDK 8, somewhere on the order of 5-6 times faster. I think that is because ZipFile doesn’t use JNI in JDK 9.
Of the two remaining sections in Tables 1 and 2, the section labeled “MultiRelease JarFile” differs from the section labeled “MultiRelease JarFile, new iterator” only in the JarEntryIterator class. The first one uses the original iterator in JarFile.java that can be seen starting with line 551 of webrev.04 <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.04/>, and the new iterator starts with line 553 of webrev.06 <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.06/>. The results are strange. The new, more complex, iterator appears to be faster than the old, simpler, iterator. I double, and triple, checked it, but it was always faster. I used jitwatch to look at the hotspot logs generated during compilation and neither method was compiled. I suppose I could dig into it further but decided not to. Consider it good news. The results do show that the multi-release enhancement slows JarFile entries/stream down by 2-18% depending on the size of the jar file. But they are still much better than the JDK 8 values.
Also I would expect that if a JAR file is not mult-release but the library opens it with Release.RUNTIME to perform the same as opening the JAR file with the Release-less constructors.
The results in Table 3 attempts to answer this question since rt.jar is not a multi-release jar file. This tells me that if one opens the JarFile with Release.RUNTIME, that there is a performance penalty of 2-6% on this very large jar file.
Finally, the results in Table 4 tell me that processing a true multi-release jar file takes about 80% more time per entries() or stream() operation. I’ve looked at this in a profiler and there is no particular area that stands out to me, it’s just more complicated to process a multi-release jar file, as would be expected.
I think the javadoc will need to also need to make it clear whether entries with names starting with META-INF/versions/ are returned.
Fixed.
I see you've added @since 9 to the existing methods, I assume you didn't mean to do this.
Fixed.
At some point then we need to discuss how RUNTIME_VERSION is computed. Iris (via Mandy) has pushed jdk.Version to jdk9/dev but having it exported by java.base conflicts with the design principles in JEP 200. Moving it to another module means that code in java.base cannot use it and thus the JAR file can't use it.
Left as is.
Sherman, I like your suggestions and will open an issue to work on the performance after integration. I’ll fix the minor “bug” you pointed out at the same time. Steve
On Feb 12, 2016, at 5:19 PM, Xueming Shen <xueming.shen@oracle.com> wrote:
Steve,
I would assume the difference of the webrev.04 "old" iterator and the webrev.06 "new" iterator that might trigger a performance is how you create the JarFileEntry. The one parameter constructor invokes isMultiRelease(), which might be relatively expensive, when the "mr" is enabled. There are couple "if" checks involved.
For the "new" iterator is slower than the current jdk9 one. It might be desired to have two JarEntryIterator classes defined, one for "notVersioned", one for the "versioned". Of course keep the two parameter constructor for the "notVersioned". This might bring performance back for the normal "notVersioned" usage, which I would assume is the majority.
There is also a "bug" in the new iterator. The iterator/Enumeration returned from ZipFile.entries() checks "ensureOpen() in both hasNext() and next(). So close() the ZipFile after itr.hasNext() will fails the next next() invocation in the old implementation. The latest itr will returns the cached "ze". Not a big issue for sure, but a kind of "regression". Defining two iterator classes as suggested above will also workaround this issue, as the "notVersioned" one will work just as expected, no regression.
-Sherman
On 2/9/16 5:04 PM, Steve Drach wrote:
Hi,
Yet another webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/ <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.06/>, with a change to JarEntryIterator to fix a problem discovered by performance tests — calling hasNext() twice as often as needed. I also removed the @since 9 tags from the methods entries() and stream(), and added an additional sentence to the spec for each of those methods that clarifies what a base entry is (actually is not).
I think having stream and entries do this is right although I would like to see some performance data if possible.
See http://cr.openjdk.java.net/~sdrach/8132734/JarFile%20Performance.pdf <http://cr.openjdk.java.net/%7Esdrach/8132734/JarFile%20Performance.pdf>
I used JMH to run the benchmark. See http://cr.openjdk.java.net/~sdrach/8132734/MyBenchmark.java <http://cr.openjdk.java.net/%7Esdrach/8132734/MyBenchmark.java>. I used two jar files, the rt.jar file from JDK 8 that has 20653 entries and the multi-release.jar found in the test directory with 14 entries. Obviously rt.jar is not a multi-release jar file.
The first two tables (1 and 2) are comparable and the second two tables are somewhat comparable (3 and 4).
Tables 1 and 2 have 4 sections that show the results of tests on the two jar files in 4 configurations of JarFile. The tests were done with a JarFile object constructed without the Release object argument, essentially the legacy constructor. The section labeled "JDK 8 JarFile” was done with JDK 8u66. The section labeled “JDK 9 JarFile” was done with the latest build of openjdk/jdk9/dev without any changes in my 8132734 changeset. I chose this section as the reference, so the last column shows the values normalized to 1 micro/milli second per operation (rt.jar times are in milliseconds and multi-release.jar times are in microseconds). It should be obvious that JDK 9 is much faster than JDK 8, somewhere on the order of 5-6 times faster. I think that is because ZipFile doesn’t use JNI in JDK 9.
Of the two remaining sections in Tables 1 and 2, the section labeled “MultiRelease JarFile” differs from the section labeled “MultiRelease JarFile, new iterator” only in the JarEntryIterator class. The first one uses the original iterator in JarFile.java that can be seen starting with line 551 of webrev.04 <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.04/>, and the new iterator starts with line 553 of webrev.06 <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.06/>. The results are strange. The new, more complex, iterator appears to be faster than the old, simpler, iterator. I double, and triple, checked it, but it was always faster. I used jitwatch to look at the hotspot logs generated during compilation and neither method was compiled. I suppose I could dig into it further but decided not to. Consider it good news. The results do show that the multi-release enhancement slows JarFile entries/stream down by 2-18% depending on the size of the jar file. But they are still much better than the JDK 8 values.
Also I would expect that if a JAR file is not mult-release but the library opens it with Release.RUNTIME to perform the same as opening the JAR file with the Release-less constructors.
The results in Table 3 attempts to answer this question since rt.jar is not a multi-release jar file. This tells me that if one opens the JarFile with Release.RUNTIME, that there is a performance penalty of 2-6% on this very large jar file.
Finally, the results in Table 4 tell me that processing a true multi-release jar file takes about 80% more time per entries() or stream() operation. I’ve looked at this in a profiler and there is no particular area that stands out to me, it’s just more complicated to process a multi-release jar file, as would be expected.
I think the javadoc will need to also need to make it clear whether entries with names starting with META-INF/versions/ are returned.
Fixed.
I see you've added @since 9 to the existing methods, I assume you didn't mean to do this.
Fixed.
At some point then we need to discuss how RUNTIME_VERSION is computed. Iris (via Mandy) has pushed jdk.Version to jdk9/dev but having it exported by java.base conflicts with the design principles in JEP 200. Moving it to another module means that code in java.base cannot use it and thus the JAR file can't use it.
Left as is.
On 10/02/2016 01:04, Steve Drach wrote:
Hi,
Yet another webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/ <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.06/>, with a change to JarEntryIterator to fix a problem discovered by performance tests — calling hasNext() twice as often as needed. I also removed the @since 9 tags from the methods entries() and stream(), and added an additional sentence to the spec for each of those methods that clarifies what a base entry is (actually is not).
I went through the latest webrev and it looks quite good. A few comments on the javadoc: "... partitioned by the major version of Java platform releases" - this might be better as "... partitioned by the major version of the Java platform". In JarFile.Release then it uses the phrase "top-most (base) directory". I thought we had purged "top-*" from the javadoc in previous iterations because it hints of classes or resources in the top most directory (which isn't the case with classes in a named package). "... will not be accessible by this JarFile" hints of access control or even security manager. Would it clearer to re-word this to something like "will not be located by methods such as getEntry" ? "returned depends whether" -> "returned depends on whether". In the javadoc for entries() and stream() then it mentions "the constructor" many times. I would be tempted to replace many of these - for example "all entries are returned, regardless of the constructor" might be better as "all entries of returned, regards of how the JarFile is created". A couple of nits on the implementation: vze = JarFile.super.getEntry(META_INF_VERSIONS + i-- + sname); - it would be more readable if you move the decrement to its own line. Also I assume that JarFile is not needed here. L942-943 looks messy too, I assume that can be cleaned up. JarFileFactory - "earl" will confuse readers, needs a comment or a better name. I think this is all that I have for now. -Alan.
Thank you Alan. I’ll address the issues you bring up before integration.
On Feb 15, 2016, at 4:30 AM, Alan Bateman <Alan.Bateman@oracle.com> wrote:
On 10/02/2016 01:04, Steve Drach wrote:
Hi,
Yet another webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/ <http://cr.openjdk.java.net/%7Esdrach/8132734/webrev.06/>, with a change to JarEntryIterator to fix a problem discovered by performance tests — calling hasNext() twice as often as needed. I also removed the @since 9 tags from the methods entries() and stream(), and added an additional sentence to the spec for each of those methods that clarifies what a base entry is (actually is not).
I went through the latest webrev and it looks quite good.
A few comments on the javadoc:
"... partitioned by the major version of Java platform releases" - this might be better as "... partitioned by the major version of the Java platform".
In JarFile.Release then it uses the phrase "top-most (base) directory". I thought we had purged "top-*" from the javadoc in previous iterations because it hints of classes or resources in the top most directory (which isn't the case with classes in a named package).
"... will not be accessible by this JarFile" hints of access control or even security manager. Would it clearer to re-word this to something like "will not be located by methods such as getEntry" ?
"returned depends whether" -> "returned depends on whether".
In the javadoc for entries() and stream() then it mentions "the constructor" many times. I would be tempted to replace many of these - for example "all entries are returned, regardless of the constructor" might be better as "all entries of returned, regards of how the JarFile is created".
A couple of nits on the implementation:
vze = JarFile.super.getEntry(META_INF_VERSIONS + i-- + sname); - it would be more readable if you move the decrement to its own line. Also I assume that JarFile is not needed here.
L942-943 looks messy too, I assume that can be cleaned up.
JarFileFactory - "earl" will confuse readers, needs a comment or a better name.
I think this is all that I have for now.
-Alan.
On 15/02/2016 16:30, Steve Drach wrote:
Thank you Alan. I’ll address the issues you bring up before integration. Thanks. Are you planning to update the webrev too as it would be nice to see the final javadoc?
-Alan
Thank you Alan. I’ll address the issues you bring up before integration. Thanks. Are you planning to update the webrev too as it would be nice to see the final javadoc?
http://cr.openjdk.java.net/~sdrach/8132734/webrev.07/index.html <http://cr.openjdk.java.net/~sdrach/8132734/webrev.07/index.html>
participants (9)
-
Alan Bateman
-
David M. Lloyd
-
Joseph D. Darcy
-
Mandy Chung
-
Paul Benedict
-
Paul Sandoz
-
Steve Drach
-
Wang Weijun
-
Xueming Shen