Create java.util.stream.Stream from Iterator / Enumeration
Hi there,

I know you are busy getting the latest release ready. Still, I have a question regarding the static helper methods of java.util.stream.Stream. During my work I ran into a couple of places where creating a Stream directly from an Iterator or Enumeration would be quite handy.

At the moment I have written two methods like this on a helper class in order not to duplicate the code:

    public static <T> Stream<T> iterate(Iterator<T> iterator) {
        Objects.requireNonNull(iterator);
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(iterator,
                        Spliterator.ORDERED | Spliterator.IMMUTABLE),
                false);
    }

    public static <T> Stream<T> iterate(Enumeration<T> enumeration) {
        Objects.requireNonNull(enumeration);
        final Iterator<T> iterator = new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return enumeration.hasMoreElements();
            }

            @Override
            public T next() {
                return enumeration.nextElement();
            }
        };
        return iterate(iterator);
    }

My question now is whether it would be worthwhile to have something like this on Stream itself?

Cheers
Patrick
Hi Patrick,

Enumeration now has an asIterator method, that’s our attempt to bridge the old traversal world to the less old ( :-) ) world.

Bridging the gap between the less old world and the new world (streams) is more subtle, and we have been holding out on that.

We wanted to avoid providing simple methods to create a Stream from an Iterator that might serve as an attractive nuisance, e.g. Stream.of(arrayList.iterator()) rather than arrayList.stream(), and there are details of size, order etc. for which any defaults might be poor or just wrong depending on the Iterator’s source of elements.

For example, your helper method assumes that the Iterator is covering a structure that is both immutable and ordered, but what if the Iterator was obtained from a HashSet?

If we chose to provide such a helper method we would have to assume that the Iterator’s source of elements is of unknown size, order and mutability; essentially no Spliterator characteristics can be derived from the Iterator.

Paul.
On 14 Jun 2016, at 01:13, Patrick Reinhart <patrick@reini.net> wrote:
> My question is now, if it would be worth while having something like this on the Stream itself?
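Paul's point about characteristics can be illustrated with a minimal sketch. The class name IteratorStreams and the method streamOf are hypothetical and not part of the JDK; the sketch assumes that a general-purpose helper knows nothing about the Iterator's source and therefore passes 0 (no characteristics) to Spliterators.spliteratorUnknownSize.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper for illustration only; not a JDK class.
final class IteratorStreams {
    private IteratorStreams() {}

    // Wraps an Iterator without claiming anything about its source:
    // characteristics = 0 means no ORDERED, IMMUTABLE, SIZED or DISTINCT flag.
    static <T> Stream<T> streamOf(Iterator<T> iterator) {
        Objects.requireNonNull(iterator);
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(iterator, 0), false);
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>(List.of("a", "b", "c"));
        // Preferred: the collection reports its own size and characteristics.
        System.out.println(names.stream().count());
        // Fallback for when only an Iterator is available.
        System.out.println(streamOf(names.iterator()).count());
    }
}
```

When the source collection is at hand, calling its own stream() method remains the better choice, which is the "attractive nuisance" concern raised above.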
Hi Paul,

I see the point about making it too easy. I should explain a little where I started from. I wanted to go over all resource URLs from a ClassLoader and read them as a Stream:

    Enumeration<URL> resources = myInstance.getClassLoader().getResources("resource.name");
    Collections.list(resources).stream()....

This internally copies all elements into an ArrayList right away and does more than I wanted... :-)

So, with your good reasons in mind, my first attempt led me to a wrong solution too - thank goodness I looked into the source code ;-)

In my case, then, is the approach I used to get a Stream correct?

Cheers
Patrick

On 14.06.2016 18:26, Paul Sandoz wrote:
> For example, your helper method assumes that the Iterator is covering a structure that is both immutable and ordered, but what if the Iterator was obtained from a HashSet?
>
> If we chose to provide such a helper method we would have to assume that the Iterator’s source of elements is of unknown size, order and mutability, essentially no Spliterator characteristics can be derived from the Iterator.
On 14 Jun 2016, at 13:11, Patrick Reinhart <patrick@reini.net> wrote:
> I wanted to go over all resource URL's from a ClassLoader and read them as a Stream:
>
>     Enumeration<URL> resources = myInstance.getClassLoader().getResources("resource.name");
>     Collections.list(resources).stream()....
>
> Which will internally copy all elements right away into an ArrayList and does more than I wanted... :-)

Right. A Spliterator from an Iterator will only copy elements (a prefix of increasing size) when splitting.

> In my case then is the used approach to get a Stream correct?

Almost:

- you can use Enumeration.asIterator() rather than creating your own.

- I don’t think you can assume the Iterator has an encounter order (even though there is a form of order related to the class loader hierarchy), i.e. you cannot assume resources from a particular class loader are presented in any particular order; it might depend on how the zip/jar was created or the order in which resources are presented in the JDK image, which IIRC might be optimized for booting up.

I had marked ClassLoader as an area to use Stream (we went through a bunch of areas that return Enumeration and added Stream-based methods, e.g. NetworkInterface) but we held off because Jigsaw was doing a lot of plumbing work.

It might be possible to revisit; it’s the type of enhancement we could get a Feature Complete (FC) extension for. I cannot promise anything here, but if you are looking for something to contribute that may be a good area to focus on now that Jigsaw is settling down.

Paul.
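The remark that an Iterator-backed Spliterator only copies a prefix of the elements when it is split can be observed with a small sketch. PrefixSplitDemo and splitOnce are hypothetical names, and the exact batch size of the copied prefix is an implementation detail of the JDK that this example does not rely on.

```java
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.IntStream;

// Hypothetical demo class for illustration only.
final class PrefixSplitDemo {
    private PrefixSplitDemo() {}

    // Splits an Iterator-backed Spliterator once and returns
    // { elements copied into the prefix, elements left in the Iterator }.
    static long[] splitOnce(int total) {
        Iterator<Integer> it = IntStream.range(0, total).boxed().iterator();
        Spliterator<Integer> rest = Spliterators.spliteratorUnknownSize(it, 0);

        // trySplit() drains a batch from the Iterator into an array-backed
        // Spliterator; it returns null when there is nothing to split off.
        Spliterator<Integer> prefix = rest.trySplit();

        long[] copied = new long[1];
        if (prefix != null) {
            prefix.forEachRemaining(e -> copied[0]++);
        }
        long[] remaining = new long[1];
        rest.forEachRemaining(e -> remaining[0]++);
        return new long[] { copied[0], remaining[0] };
    }

    public static void main(String[] args) {
        long[] counts = splitOnce(5000);
        System.out.println("copied into prefix: " + counts[0]);
        System.out.println("left in iterator:   " + counts[1]);
    }
}
```

In a sequential stream no split happens, so nothing is copied up front, in contrast to Collections.list(resources).stream().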
> Almost:
> - you can use Enumeration.asIterator() rather than creating your own.

Right, for JDK 9 that will be the right way. In the meantime, under JDK 8, I will have to write my own ;-)

> - I don’t think you can assume the Iterator has an encounter order (even though there is a form of order related to class loader hierarchy, i.e. you cannot assume resources from a particular class loader are presented in any particular order, it might depend on how the zip/jar was created or the order in which resources are presented in the JDK image, which IIRC the order might optimized for booting up).

So in that case only IMMUTABLE will be appropriate, possibly also NONNULL as far as I understand the getResources() method documentation.

> I had marked ClassLoader as an area to use Stream (we went through a bunch of areas that return Enumeration and add Stream-based methods e.g. NetworkInterface) but we held off because Jigsaw was doing a lot of plumbing work.

I have already done some hacking on the Jigsaw stuff and I liked it quite a lot. The biggest problem I see is the time it will take to have all required libraries converted to Jigsaw too. All in all you all did a great job there and I hope this will be appreciated in the end...

> It might be possible to revisit, it’s the type of enhancement we could get a Feature Complete (FC) extension for. I cannot promise anything here, but if you are looking for something to contribute that may be a good area of focus on now Jigsaw is settling down.

So, what do you suggest I should do now? Should I open an enhancement issue for that?

Patrick
On 15 Jun 2016, at 09:35, Patrick Reinhart <patrick@reini.net> wrote:
> So in that case only IMMUTABLE will be appropriate, possibly also NONNULL as far I understand the getResources() method documentation.

Yes.

> I did already some hacking on the Jigsaw stuff and I liked it quit a lot. The biggest problem I see is the time it takes to have all required libraries being converted to Jigsaw too.

Automatic modules should help:

http://openjdk.java.net/projects/jigsaw/spec/sotms/#automatic-modules

> So, what do you suggest that should do now? Should I open a enhancement Issue for that?

Yes, thanks,
Paul.
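The outcome agreed so far (use Enumeration.asIterator() on JDK 9, claim only NONNULL and IMMUTABLE, and do not claim ORDERED) could look roughly like the sketch below. EnumerationStreams and streamOf are hypothetical names, the resource name in main is only an example, and asIterator() requires JDK 9 or later.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Enumeration;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper for illustration only; not a JDK class.
final class EnumerationStreams {
    private EnumerationStreams() {}

    // The caller asserts that the source never yields null and is not modified
    // during traversal. ORDERED, SIZED and DISTINCT are deliberately not claimed.
    static <T> Stream<T> streamOf(Enumeration<T> enumeration) {
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(
                        enumeration.asIterator(), // JDK 9+
                        Spliterator.NONNULL | Spliterator.IMMUTABLE),
                false);
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        // Example resource name; any name present on the class path works.
        Enumeration<URL> urls = loader.getResources("META-INF/MANIFEST.MF");
        streamOf(urls).forEach(System.out::println);
    }
}
```

On JDK 8 the asIterator() call would have to be replaced by a hand-written adapter such as the anonymous Iterator from the first message.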
Hi Paul, I finally got the time to create the enhancement issue JDK-8161230 for the alternative methods returning a Stream instead of an Enumeration. Cheers Patrick
Hi Patrick,

On 12 Jul 2016, at 21:25, Patrick Reinhart <patrick@reini.net> wrote:
> I finally got the time to create the enhancement issue JDK-8161230 for the alternative methods returning a Stream instead of an Enumeration.

Ok.

I see some comments already by Stuart and Alan, and concur with Alan about working closely with jigsaw-dev.

Focusing on the public methods is good, and then it will come down to naming them appropriately. Given that sub-classes can override the existing public Enumeration-returning methods, we most likely need to specify the implementation behaviour of the Stream-returning methods, i.e. specify in an @implSpec that they call the Enumeration-returning methods. Later on it may be possible for sub-types to implement them more optimally (but the Enumeration “string” is quite long and intertwined with the internal code base).

The protected findResources method may be trickier; I suggest leaving that one alone for now.

Paul.
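The @implSpec idea can be sketched on a stand-in class. ResourceSource and JarResourceSource are hypothetical and use plain strings instead of URLs to stay self-contained; this is not the ClassLoader implementation, only an illustration of why the Stream-returning method would delegate to the overridable Enumeration-returning one.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical stand-in for a class such as ClassLoader.
class ResourceSource {
    // Existing Enumeration-returning method that subclasses may override.
    public Enumeration<String> getResources(String name) {
        return Collections.emptyEnumeration();
    }

    /**
     * Returns a stream over the resources with the given name.
     *
     * @implSpec The default implementation calls {@link #getResources(String)}
     * and wraps its result, so existing overrides of that method are honoured.
     */
    public Stream<String> resources(String name) {
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(
                        getResources(name).asIterator(), // JDK 9+
                        Spliterator.NONNULL | Spliterator.IMMUTABLE),
                false);
    }
}

// A subclass written before resources(String) existed: it only overrides
// the Enumeration-returning method.
final class JarResourceSource extends ResourceSource {
    @Override
    public Enumeration<String> getResources(String name) {
        return Collections.enumeration(
                List.of("jar:a!/" + name, "jar:b!/" + name));
    }
}

final class ImplSpecDemo {
    public static void main(String[] args) {
        // The inherited stream method picks up the subclass behaviour.
        new JarResourceSource().resources("x.txt").forEach(System.out::println);
    }
}
```

Because the new method is specified in terms of the old one, subclasses that only override getResources keep working without changes.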
Hi Paul,

On 2016-07-13 10:28, Paul Sandoz wrote:
> Focusing on the public methods is good and then it will come down to naming them appropriately. Given that sub-classes can override the existing public Enumeration returning methods we most likely need to specify the implementation behaviour of the Stream returning methods i.e. specify in an @implSpec that they call the Enumeration returning methods. Later on it may be possible to sub-types to implement more optimally (but the Enumeration “string” is quite long and intertwined with the internal code base)

If I understand you correctly, we should concentrate on naming the public methods first? Initially I was not sure what a proper name for the stream methods would be. It seems reasonable to me to name them the way Stuart suggested in his first comment, something like this:

    Stream<URL> resources(String name)
    Stream<URL> systemResources(String name)

Does anyone have a better naming suggestion? For me those names fit so far. As for the stream characteristics, I would suggest that the stream has an unknown size and is immutable in both cases. Maybe the entries are also distinct, but I'm not sure about that.

> The protected findResources method may be trickier, i suggest leaving that one alone for now.

As a consumer of the public API I have no problem with this approach, as I created the enhancement issue for the public methods anyway. Better to deal with that when we know how to do it better.

Cheers
Patrick
On 14 Jul 2016, at 17:55, Patrick Reinhart <patrick@reini.net> wrote:
> It seem to me reasonable the way Stuart pointed them out on his first comment to name them something like this:
>
>     Stream<URL> resources(String name)
>     Stream<URL> systemResources(String name)

Yes.

> If we look into the stream characteristics I would suggest that it has a unknown size and is immutable in both cases. Maybe the entries are also distinct, but there I'm not sure.

I would expect the URLs to be distinct, but that might not be consistent with URL.equals, i.e. I don’t trust URL handlers :-) Therefore I would be wary of including the DISTINCT characteristic.

Paul.
Hi Paul,

I was quite busy lately and this comes a bit late; I guess you do not have less work ;-)

On 15.07.2016 17:10, Paul Sandoz wrote:
>> Stream<URL> resources(String name)
>> Stream<URL> systemResources(String name)
>
> Yes.

I have a first proposal for the new methods and their documentation, to start the discussion about the actual API without the implementation yet:

    /**
     * Finds all the resources with the given name. A resource is some data
     * (images, audio, text, etc) that can be accessed by class code in a way
     * that is independent of the location of the code.
     *
     * Resources in a named module are private to that module. This method does
     * not find resources in named modules.
     *
     * <p>The name of a resource is a <tt>/</tt>-separated path name that
     * identifies the resource.
     *
     * <p> The search order is described in the documentation for {@link
     * #getResource(String)}. </p>
     *
     * @apiNote When overriding this method it is recommended that an
     * implementation ensures that any delegation is consistent with the {@link
     * #getResource(java.lang.String) getResource(String)} method. This should
     * ensure that the first element returned by the stream is the same
     * resource that the {@code getResource(String)} method would return.
     *
     * @param  name
     *         The resource name
     *
     * @return  A stream of {@link java.net.URL <tt>URL</tt>} objects for
     *          the resource. If no resources could be found, the stream
     *          will be empty. Resources that the class loader doesn't have
     *          access to will not be in the stream.
     *
     * @throws  IOException
     *          If I/O errors occur
     *
     * @see  #findResources(String)
     *
     * @since  1.9
     */
    public Stream<URL> resources(String name) throws IOException {
        // to be implemented later
    }

    /**
     * Finds all resources of the specified name from the search path used to
     * load classes. The resources thus found are returned as an
     * {@link java.util.stream.Stream <tt>Stream</tt>} of {@link
     * java.net.URL <tt>URL</tt>} objects.
     *
     * Resources in a named module are private to that module. This method does
     * not find resources in named modules.
     *
     * <p> The search order is described in the documentation for {@link
     * #getSystemResource(String)}. </p>
     *
     * @param  name
     *         The resource name
     *
     * @return  A stream of resource {@link java.net.URL <tt>URL</tt>}
     *          objects
     *
     * @throws  IOException
     *          If I/O errors occur
     *
     * @since  1.9
     */
    public static Stream<URL> systemResources(String name) throws IOException {
        // to be implemented later
    }

> I would expect the URLs to be distinct, but that might not be consistent with URL.equals i.e. i don’t trust URL handlers :-) therefore i would be wary of including the DISTINCT characteristic.

So, I was right not to be completely sure about DISTINCT :-) - then I would go for the NONNULL and IMMUTABLE characteristics to start with...
On 04/08/2016 10:33, Patrick Reinhart wrote:
> I have a first proposal for the new methods and their documentation to start with the discussion about the actual API without the implementation jet:

The method names look right but I don't think `throws IOException` is needed. If overridden then the implementations could be truly lazy, and the method will need to specify how stream operations will wrap the errors in UncheckedIOExceptions.

For the initial sentence it might be better to say that it "Returns a stream that loads the resources ...".

As I mentioned previously, we will be replacing the javadoc for the existing methods and this will impact the wording for the new methods. It's okay to align the wording for the new methods with the old and we'll adjust once there is agreement on the proposal in JSR 376 and we bring the changes to JDK 9.

-Alan
On 05.08.2016 06:18, Alan Bateman wrote:
> The method names look right but I don't think `throws IOException` is needed. If overridden then the implementations could be truely lazy and the method will need to specify how stream operations will wrap the errors in UncheckedIOExceptions.
>
> For the initial sentence then it might be better to say that it "Returns a stream that loads the resources ...".

I tried to integrate your suggested changes here:

http://cr.openjdk.java.net/~reinhapa/reviews/8161230/ClassLoader_StreamMetho...

Patrick
On 08/08/2016 17:29, Patrick Reinhart wrote:
> I tried to integrate your suggested changes here:
> http://cr.openjdk.java.net/~reinhapa/reviews/8161230/ClassLoader_StreamMetho...

I should have been clearer. A lazy implementation of the resources/systemResources methods won't throw any exceptions; instead any I/O exceptions will be wrapped with an UncheckedIOException and then thrown from the method that caused the access to take place. There are several examples of this already. For the javadoc, this will be described in the method description rather than in a @throws.

-Alan
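The lazy behaviour Alan describes can be sketched as follows. LazyResources and its Lookup interface are hypothetical stand-ins for a resource lookup that may fail with an IOException; the sketch uses the Supplier-based StreamSupport.stream overload so that the lookup only runs once a terminal operation starts.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Supplier;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical example class; not part of the proposal.
final class LazyResources {
    private LazyResources() {}

    // Hypothetical lookup that may fail with a checked IOException.
    interface Lookup {
        Iterator<String> find(String name) throws IOException;
    }

    // No 'throws IOException': nothing is looked up when this method returns.
    static Stream<String> resources(Lookup lookup, String name) {
        Supplier<Spliterator<String>> source = () -> {
            try {
                return Spliterators.spliteratorUnknownSize(lookup.find(name), 0);
            } catch (IOException e) {
                // Surfaces from the terminal operation that triggered the lookup.
                throw new UncheckedIOException(e);
            }
        };
        // The supplier is invoked only when a terminal operation commences.
        return StreamSupport.stream(source, 0, false);
    }

    public static void main(String[] args) {
        Stream<String> ok = resources(n -> List.of(n + "#1", n + "#2").iterator(), "res");
        ok.forEach(System.out::println);

        Stream<String> failing = resources(n -> { throw new IOException("boom"); }, "res");
        try {
            failing.count(); // the lookup runs here, not in resources(...)
        } catch (UncheckedIOException e) {
            System.out.println("wrapped: " + e.getCause().getMessage());
        }
    }
}
```

This is why the javadoc would describe the wrapping in the method description: the creating method itself never throws the I/O exception.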
On 08.08.2016 at 18:55, Alan Bateman <Alan.Bateman@oracle.com> wrote:
> I should have been clearer. A lazy implementation of resources/systemResources methods won't throw any exceptions, instead any I/O exceptions will be wrapped with an UncheckedIOException and then thrown from the method that caused the access to take place. There are several examples of this already. For the javadoc then this will be described in the method description rather than a @throws.

I hope that this version is more like what you meant…

http://cr.openjdk.java.net/~reinhapa/reviews/8161230/ClassLoader_StreamMethods.02

Patrick
On 8 Aug 2016, at 12:14, Patrick Reinhart <patrick@reini.net> wrote:
> I hope that this version is more likely that what you meant…
>
> http://cr.openjdk.java.net/~reinhapa/reviews/8161230/ClassLoader_StreamMethods.02

Perhaps consider:

    The loading of resources will occur when the returned stream is
    evaluated. If the loading of resources results in an
    {@code IOException} then the I/O exception is wrapped in an
    {@link UncheckedIOException} that is then thrown.

Instead of <tt>…</tt> use {@code … }

Paul.
On 17.08.2016 01:50, Paul Sandoz wrote:
> Perhaps consider:
>
>     The loading of resources will occur when the returned stream is
>     evaluated. If the loading of resources results in an
>     {@code IOException} then the I/O exception is wrapped in an
>     {@link UncheckedIOException} that is then thrown.
>
> Instead of <tt>…</tt> use {@code … }

Hi Paul,

Thanks for the input. I integrated that into the version here:

http://cr.openjdk.java.net/~reinhapa/reviews/8161230/ClassLoader_StreamMetho...

Patrick
On 17 Aug 2016, at 02:39, Patrick Reinhart <patrick@reini.net> wrote:
> Thanks for the input. I integrated that into the version here:
>
> http://cr.openjdk.java.net/~reinhapa/reviews/8161230/ClassLoader_StreamMetho...

Onward to the implementation!

Paul.
On 19.08.2016 00:17, Paul Sandoz wrote:
> Onward to the implementation!

I have not yet managed to get OpenJDK to compile successfully under my Fedora Linux as I could in the past. Anyhow, here is a sample implementation that I will integrate into the existing ClassLoader, along with the corresponding test, when we are all happy (or when I get the compilation up and running again):

http://cr.openjdk.java.net/~reinhapa/reviews/8161230/ImplementationProposal....

- Patrick
participants (3): Alan Bateman, Patrick Reinhart, Paul Sandoz