Request/discussion: BufferedReader reading using async API while providing sync API
Brunoais
brunoaiss at gmail.com
Wed Oct 26 19:46:30 UTC 2016
Thank you.
Only one thing left: how can I clear the OS' file read cache?
The only way I know is to allocate a very large amount of memory, based
on the cached-I/O figures shown in the Resource Monitor (Windows) or
System Monitor (Linux), and then run the program. In this case I have no
idea how much memory each tester's computer has, so I cannot use that
method. How would you write such a program excerpt?
As for the rest of the pointers: thank you. I'll start building the
benchmark code based on that information.
On 26/10/2016 18:24, Peter Levart wrote:
> Hi Brunoais,
>
> I'll try to tell what I know from my JMH practice:
>
> On 10/26/2016 10:30 AM, Brunoais wrote:
>> Hey guys. Any idea where I can find instructions on how to use JMH to:
>>
>> 1. Clear OS' file reading cache.
>
> You can create a public void method and have JMH call it before each:
> - trial (a set of iterations)
> - iteration (a set of test method invocations)
> - invocation
>
> ...simply by annotating it with @Setup( [ Level.Trial | Level.Iteration
> | Level.Invocation ] ).
>
> So create a method that spawns a script that clears the cache.
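>
> For example, a minimal sketch, assuming a Linux host and that the
> benchmark user may run the command below via sudo (the method name is
> made up):
>
> @Setup(Level.Iteration)
> public void dropFileCache() throws Exception {
>     // Flush dirty pages and drop the page cache before each iteration.
>     // Requires root, or a sudoers entry for exactly this command.
>     new ProcessBuilder("sudo", "sh", "-c",
>             "sync; echo 3 > /proc/sys/vm/drop_caches")
>             .inheritIO()
>             .start()
>             .waitFor();
> }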
>
>> 2. Warm up whatever it needs to (maybe reading from a Channel in memory).
>
> JMH already warms up the code and the VM simply by executing "warmup"
> iterations before starting the real measured iterations. You can control
> the number of warm-up iterations and real measured iterations by
> annotating either the class or the method(s) with:
>
> @Warmup(iterations = ...)
> @Measurement(iterations = ...)
>
> If you want to warm up resources with code that differs from the code in
> the test method(s), then @Setup methods at the different levels could be
> used for that.
>
>>
>> 3. Create a BufferedInputStream with a FileInputStream inside, with
>> configurable buffer sizes.
>
> You can annotate a field of int, long or String type, in a class
> annotated with @State (which can be the benchmark class itself), with the
> @Param annotation, enumerating the values this field will take before
> the @Setup(Level.Trial) method(s) are executed. So you enumerate the
> buffer sizes in the @Param annotation and instantiate the
> BufferedInputStream using that value in a @Setup method. Voilà.
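>
> A rough sketch of how that could look (the path and the names here are
> placeholders, not taken from the prototype):
>
> @State(Scope.Benchmark)
> public static class BenchState {
>
>     @Param({"8192", "65536", "524288"})
>     int bufferSize;
>
>     BufferedInputStream in;
>
>     @Setup(Level.Iteration)
>     public void open() throws IOException {
>         // re-open before each iteration so every measured pass
>         // reads the file from the start
>         in = new BufferedInputStream(
>                 new FileInputStream("/data/bigfile.bin"), bufferSize);
>     }
>
>     @TearDown(Level.Iteration)
>     public void close() throws IOException {
>         in.close();
>     }
> }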
>
>> 4. Execute iterations to read the file fully.
>
> Then perhaps you could use only one invocation per iteration and
> measure it using @BenchmarkMode(Mode.SingleShotTime), constructing
> the loop yourself.
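>
> For example (building on the state sketch above):
>
> @Benchmark
> @BenchmarkMode(Mode.SingleShotTime)
> @OutputTimeUnit(TimeUnit.MILLISECONDS)
> public long readWholeFile(BenchState state) throws IOException {
>     byte[] chunk = new byte[64 * 1024]; // could itself be a @Param, see below
>     long total = 0;
>     int n;
>     // one invocation == one full pass over the file
>     while ((n = state.in.read(chunk)) != -1) {
>         total += n;
>     }
>     return total; // return the count so the loop cannot be optimized away
> }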
>
>> 1. Allow setting the byte[] size.
>
> Use @Param on a field to hold the byte[] size and create the
> byte[] in a @Setup method...
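>
> For instance:
>
> @Param({"1024", "8192", "65536"})
> int chunkSize;
>
> byte[] chunk;
>
> @Setup(Level.Iteration)
> public void allocateChunk() {
>     chunk = new byte[chunkSize];
> }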
>
>> 2. On each iteration, burn a set number of CPU cycles.
>
> Blackhole.consumeCPU(tokens)
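>
> In the read loop sketched above that would be, for example:
>
> while ((n = state.in.read(chunk)) != -1) {
>     total += n;
>     // simulate a fixed amount of per-chunk processing, giving the OS
>     // read-ahead time to refill in the background
>     Blackhole.consumeCPU(1_000); // the token count is arbitrary
> }
>
> (Blackhole comes from org.openjdk.jmh.infra.Blackhole.)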
>
>> 5. Re-execute 1, 3 and 4 but with a BufferedNonBlockStream and a
>> FileChannel.
>
> If you wrap them all into a common API (by delegation), you can use a
> @Param String implType, with a @Setup method to instantiate the
> appropriate implementation. Then just invoke the common API in the
> test method.
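>
> A sketch of that shape; the interface, the field names and the
> BufferedNonBlockStream constructor are assumptions here, to be adapted
> to the real prototype:
>
> public interface ByteSource {
>     int read(byte[] dst) throws IOException;
> }
>
> @Param({"BufferedInputStream", "BufferedNonBlockStream"})
> String implType;
>
> ByteSource source;
>
> @Setup(Level.Iteration)
> public void openSource() throws IOException {
>     if (implType.equals("BufferedInputStream")) {
>         InputStream in = new BufferedInputStream(
>                 new FileInputStream("/data/bigfile.bin"), bufferSize);
>         source = in::read; // delegates to InputStream.read(byte[])
>     } else {
>         BufferedNonBlockStream in = new BufferedNonBlockStream(
>                 FileChannel.open(Paths.get("/data/bigfile.bin")), bufferSize);
>         source = in::read; // assumes the prototype exposes read(byte[])
>     }
> }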
>
>>
>> So far I still can't find how to:
>>
>> 1 (clear OS' cache)
>> 3 (the configuration part)
>> 4 (variable number of iterations)
>> 4.1 (the configuration)
>>
>> Can someone please point me in the right direction?
>
> I can create an example test if you like and you can then extend it...
>
> Regards, Peter
>
>>
>>
>> On 26/10/2016 07:57, Brunoais wrote:
>>>
>>> Hey Bernd!
>>>
>>> I don't know how far back you did such a thing, but I'm getting
>>> positive results with my non-JMH tests. I do still have to check that
>>> my results make sense. After some reads, the OS starts caching the
>>> file, which is not what I want. It's easy to tell when that happens,
>>> though: the times fall from ~30s to ~5s and the HDD stays nearly idle
>>> while reading (just looking at the LED is enough to tell).
>>>
>>> If you don't simulate synchronous work and you only run the reads, you
>>> will only get marginal gains, as the OS has no real time to fill
>>> the buffer.
>>> My research shows the 2 major kernels (Windows' and GNU/Linux's) have
>>> non-blocking user-level buffer handling where I hand a buffer to
>>> the OS to read into and it keeps filling it, sending messages/signals
>>> as it writes chunks. Linux has an OS interrupt that only sends the
>>> signal after the buffer is full, though. There's also another variant
>>> where they use an internal buffer of the same size as the buffer
>>> you allocate for the OS and then internally call memcpy() into your
>>> user-level memory when asked. Tests on the internet show that
>>> memcpy() is as fast as (for 0-1 elements) or faster than
>>> System.arraycopy(). I have no idea if they are true.
>>>
>>> All this was for me to add that that code is tuned to copy from the
>>> read buffer only when it is at least at half capacity and the
>>> internal buffer has enough storage space. The copy is forced only
>>> if nothing was read on the previous fill() call. It is built to
>>> use JNI as little as possible while providing the major contract
>>> BufferedInputStream has.
>>> Finally, I never, ever compact the read buffer; that would require a
>>> memcpy which is definitely not necessary.
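>>>
>>> Roughly, as an illustration only (the names are made up, this is not
>>> the pastebin code):
>>>
>>> // copy from the OS read buffer into the internal buffer when the read
>>> // buffer is at least half full and the internal buffer has room for it,
>>> // or unconditionally when the previous fill() produced no data at all
>>> static boolean shouldCopy(int readBufFill, int readBufCapacity,
>>>                           int internalFree, boolean lastFillWasEmpty) {
>>>     return (readBufFill >= readBufCapacity / 2 && internalFree >= readBufFill)
>>>             || lastFillWasEmpty;
>>> }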
>>>
>>> Anyway, those timing tests were just to get an order of
>>> magnitude for the speed difference. I intended to do them differently,
>>> but JMH looks good so I'll use JMH to test now.
>>>
>>> Short reads only happen when fill(true) is called. That happens only
>>> when data is desperately needed.
>>>
>>> I'll look into avoiding the double read requests. I don't think it
>>> will bring significant improvements, if any at all. It only happens
>>> when the buffer is nearly empty and any byte of data is welcome "at
>>> any cost".
>>> Besides, whoever called read at that point would also have seen
>>> available() return 0 and still called read()/read(byte[]).
>>>
>>>
>>> On 26/10/2016 06:14, Bernd Eckenfels wrote:
>>>> Hello Brunoais,
>>>>
>>>> In the past I did some experiments with non-blocking file channels
>>>> in the hope of increasing throughput in a similar way to your
>>>> buffered stream. I also used directly allocated buffers. However, my
>>>> results were not that encouraging (especially if an upper layer
>>>> used larger reads). I thought back then that this was mostly due
>>>> to the fact that it does NOT map to real async file I/O on most
>>>> platforms. But maybe I just measured it wrong, so I will have a
>>>> closer look at your impl.
>>>>
>>>> Generally I would recommend making the benchmark a bit more
>>>> reliable with JMH and, in order to do this, externalizing the direct
>>>> buffer allocation (as it is slow if done repeatedly). This also
>>>> allows you to publish some results with varying workloads (on
>>>> different machines).
>>>>
>>>> I would also measure the readCount to see if short reads happen.
>>>>
>>>> BTW, you might as well try to only read till the end of the buffer
>>>> in the case where the backfill wraps around, and not issue two
>>>> requests; that might remove some additional latency.
>>>>
>>>> Regards,
>>>> Bernd
>>>> --
>>>> http://bernd.eckenfels.net
>>>>
>>>> _____________________________
>>>> From: Brunoais <brunoaiss at gmail.com>
>>>> Sent: Monday, October 24, 2016 6:30 PM
>>>> Subject: Re: Request/discussion: BufferedReader reading using async
>>>> API while providing sync API
>>>> To: Pavel Rappo <pavel.rappo at oracle.com>
>>>> Cc: <core-libs-dev at openjdk.java.net>
>>>>
>>>>
>>>> Attached and sending!
>>>>
>>>>
>>>> On 24/10/2016 13:48, Pavel Rappo wrote:
>>>> > Could you please send a new email on this list with the source
>>>> attached as a
>>>> > text file?
>>>> >
>>>> >> On 23 Oct 2016, at 19:14, Brunoais <brunoaiss at gmail.com> wrote:
>>>> >>
>>>> >> Here's my poc/prototype:
>>>> >> http://pastebin.com/WRpYWDJF
>>>> >>
>>>> >> I've implemented the bare minimum of the class that follows the
>>>> same contract as BufferedReader, while signaling in comments all the
>>>> issues I think it has or may have.
>>>> >> I also wrote some javadoc to help guide readers through the class.
>>>> >>
>>>> >> I could have used more fields from BufferedReader but the names
>>>> were so minimalistic that they were confusing me. I intend to change
>>>> them before sending this to OpenJDK.
>>>> >>
>>>> >> One of the major problems this has is long overflow. It is
>>>> major because it is hidden, it will be extremely rare and it takes
>>>> a really long time to reproduce. There are different ways of
>>>> dealing with it, from just documenting it to actually writing code
>>>> that handles it.
>>>> >>
>>>> >> I built some simple test code for it to get an idea of
>>>> performance and correctness.
>>>> >>
>>>> >> http://pastebin.com/eh6LFgwT
>>>> >>
>>>> >> This doesn't thoroughly test whether it is actually working
>>>> correctly, but I see no reason for it not to work correctly after
>>>> fixing the 2 bugs that test found.
>>>> >>
>>>> >> I'll also leave here some conclusions about speed and resource
>>>> consumption I found.
>>>> >>
>>>> >> I made tests with the default buffer size, 5000B, 15_000B and
>>>> 500_000B. I noticed that, with my hardware and the 1 530 000 000B
>>>> file, I was getting around:
>>>> >>
>>>> >> With all buffer sizes and fake work: 10~15s speed improvement (from
>>>> 90% HDD speed to 100% HDD speed)
>>>> >> With all buffer sizes and no fake work: 1~2s speed improvement (from
>>>> 90% HDD speed to 100% HDD speed)
>>>> >>
>>>> >> Changing the buffer size gave different reading speeds, but both
>>>> implementations changed by roughly the same amount when the buffer
>>>> size changed.
>>>> >> Finally, I could always confirm that I/O was the slowest
>>>> thing while this code was running.
>>>> >>
>>>> >> For those wondering about the file size: it is both to avoid the
>>>> OS cache and to make the reads match the main use case these objects
>>>> are for (large streams of bytes).
>>>> >>
>>>> >> @Pavel, are you open for discussion now ;)? Need anything else?
>>>> >>
>>>> >> On 21/10/2016 19:21, Pavel Rappo wrote:
>>>> >>> Just to append to my previous email: BufferedReader wraps any
>>>> Reader out there,
>>>> >>> not specifically FileReader, while you're talking about the
>>>> case of efficient
>>>> >>> reading from a file.
>>>> >>>
>>>> >>> I guess there's one existing possibility to provide exactly
>>>> what you need (as I
>>>> >>> understand it) under this method:
>>>> >>>
>>>> >>> /**
>>>> >>> * Opens a file for reading, returning a {@code BufferedReader}
>>>> to read text
>>>> >>> * from the file in an efficient manner...
>>>> >>> ...
>>>> >>> */
>>>> >>> java.nio.file.Files#newBufferedReader(java.nio.file.Path)
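>>>> >>>
>>>> >>> For reference, callers only see something like this (the path
>>>> >>> below is a placeholder):
>>>> >>>
>>>> >>> BufferedReader r = Files.newBufferedReader(Paths.get("/data/big.txt"));
>>>> >>>
>>>> >>> ...so nothing in that signature pins the result to a particular
>>>> >>> BufferedReader subclass.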
>>>> >>>
>>>> >>> It can return _anything_ as long as it is a BufferedReader. We
>>>> can do it, but it
>>>> >>> needs to be investigated not only for your favorite OS but for
>>>> other OSes as
>>>> >>> well. Feel free to prototype this and we can discuss it on the
>>>> list later.
>>>> >>>
>>>> >>> Thanks,
>>>> >>> -Pavel
>>>> >>>
>>>> >>>> On 21 Oct 2016, at 18:56, Brunoais <brunoaiss at gmail.com> wrote:
>>>> >>>>
>>>> >>>> Pavel is right.
>>>> >>>>
>>>> >>>> In reality, I was expecting such a BufferedReader to use only a
>>>> single buffer and have that buffer be filled asynchronously, not
>>>> in a different thread.
>>>> >>>> Additionally, I don't intend to use a larger
>>>> buffer than before unless stated through the API (the constructor).
>>>> >>>>
>>>> >>>> In my idea, internally, it is supposed to use
>>>> java.nio.channels.AsynchronousFileChannel or equivalent.
>>>> >>>>
>>>> >>>> It does not prevent having two buffers, and I do not intend to
>>>> change BufferedReader itself. I'd make a BufferedAsyncReader of
>>>> sorts (any name suggestion is welcome as I'm an awful namer).
>>>> >>>>
>>>> >>>>
>>>> >>>> On 21/10/2016 18:38, Roger Riggs wrote:
>>>> >>>>> Hi Pavel,
>>>> >>>>>
>>>> >>>>> I think Brunoais is asking for a double-buffering scheme in
>>>> which the implementation of
>>>> >>>>> BufferedReader fills a second buffer in parallel with the
>>>> application reading from the 1st buffer,
>>>> >>>>> managing the swaps and async reads transparently.
>>>> >>>>> It would not change the API but would change the interactions
>>>> between the buffered reader
>>>> >>>>> and the underlying stream. It would also increase memory
>>>> requirements and processing
>>>> >>>>> by introducing or using a separate thread and the necessary
>>>> synchronization.
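>>>> >>>>>
>>>> >>>>> Very roughly, such a scheme could have this shape (just a sketch
>>>> >>>>> with invented names, using AsynchronousFileChannel for the async
>>>> >>>>> reads):
>>>> >>>>>
>>>> >>>>> static void copy(AsynchronousFileChannel ch, OutputStream out) throws Exception {
>>>> >>>>>     ByteBuffer a = ByteBuffer.allocate(64 * 1024); // drained by the application
>>>> >>>>>     ByteBuffer b = ByteBuffer.allocate(64 * 1024); // filled in the background
>>>> >>>>>     long pos = 0;
>>>> >>>>>     Future<Integer> fill = ch.read(a, pos);        // start the first async fill
>>>> >>>>>     while (true) {
>>>> >>>>>         int n = fill.get();                        // wait for the pending fill
>>>> >>>>>         if (n <= 0) break;                         // EOF
>>>> >>>>>         pos += n;
>>>> >>>>>         fill = ch.read(b, pos);                    // refill the other buffer
>>>> >>>>>         out.write(a.array(), 0, n);                // consume the ready buffer
>>>> >>>>>         a.clear();
>>>> >>>>>         ByteBuffer t = a; a = b; b = t;            // swap roles
>>>> >>>>>     }
>>>> >>>>> }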
>>>> >>>>>
>>>> >>>>> Though I think the formal interface semantics could be
>>>> maintained, I have doubts
>>>> >>>>> about compatibility and its unintended consequences on
>>>> existing subclasses,
>>>> >>>>> applications and libraries.
>>>> >>>>>
>>>> >>>>> $.02, Roger
>>>> >>>>>
>>>> >>>>> On 10/21/16 1:22 PM, Pavel Rappo wrote:
>>>> >>>>>> Off the top of my head, I would say it's not possible to
>>>> change the design of an
>>>> >>>>>> _extensible_ type that has been out there for 20 or so
>>>> years. All these I/O
>>>> >>>>>> streams from java.io were designed for the
>>>> simple synchronous use case.
>>>> >>>>>>
>>>> >>>>>> It's not that their design is flawed in some way, it's that
>>>> they don't seem to
>>>> >>>>>> suit your needs. Have you considered using
>>>> java.nio.channels.AsynchronousFileChannel
>>>> >>>>>> in your applications?
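>>>> >>>>>>
>>>> >>>>>> For example (just a usage sketch; the path is a placeholder):
>>>> >>>>>>
>>>> >>>>>> AsynchronousFileChannel ch = AsynchronousFileChannel.open(
>>>> >>>>>>         Paths.get("/data/big.txt"), StandardOpenOption.READ);
>>>> >>>>>> Future<Integer> pending = ch.read(ByteBuffer.allocate(8192), 0);
>>>> >>>>>> // ...do other work, then pending.get() when the data is needed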
>>>> >>>>>>
>>>> >>>>>> -Pavel
>>>> >>>>>>
>>>> >>>>>>> On 21 Oct 2016, at 17:08, Brunoais <brunoaiss at gmail.com> wrote:
>>>> >>>>>>>
>>>> >>>>>>> Any feedback on this? I'm really interested in implementing
>>>> such a BufferedReader/BufferedStreamReader to allow speeding up my
>>>> applications without having to think asynchronously or about
>>>> multi-threading while programming with it.
>>>> >>>>>>>
>>>> >>>>>>> That's why I'm asking this here.
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> On 13/10/2016 14:45, Brunoais wrote:
>>>> >>>>>>>> Hi,
>>>> >>>>>>>>
>>>> >>>>>>>> I looked at the BufferedReader source code for Java 9 along
>>>> with the source code of the channels/streams it uses. I noticed that,
>>>> like in Java 7, BufferedReader does not use an async API to load
>>>> data from files; instead, the data loading is all done
>>>> synchronously, even when the OS allows requesting a file to be read
>>>> and getting a notification later when the file has actually been read.
>>>> >>>>>>>>
>>>> >>>>>>>> Why is BufferedReader not async while providing a sync API?
>>>> >>>>>>>>
>>>> >
>>>>
>>>>
>>>>
>>>
>>
>