Request/discussion: BufferedReader reading using async API while providing sync API

Brunoais brunoaiss at gmail.com
Thu Oct 27 12:20:53 UTC 2016


Hey.

Any idea how to skip tests? When testing BufferedInputStream, the 
directBufferSize parameter is not used, so testing with different 
directBufferSize values makes no sense.

I already tried returning 0 from the benchmarked method, but JMH fills 
the output with " (*interrupt*)" if I do that.


On 26/10/2016 21:41, Peter Levart wrote:
>
>
>
> On 10/26/2016 09:46 PM, Brunoais wrote:
>>
>> Thank you.
>>
>> Only one thing left. How can I "burn" the OS' file read cache?
>> I only know how to do that by allocating a very large amount of 
>> memory, based on the cached I/O information I see in the resource 
>> manager (Windows) or system monitor (Linux), and then running the 
>> program. In this case, I have no idea how much memory each person's 
>> computer has, so I cannot use the same method. How would you write 
>> such a program excerpt?
>>
>> As for the rest of the pointers: thank you, I'll start building the 
>> benchmark code based on that information.
>>
>
> Here's a prototype you can extend. It'll jump-start you in 5th gear:
>
> http://cr.openjdk.java.net/~plevart/misc/FileReadBench/FileReadBench.java
>
> Regards, Peter
>
>>
>> On 26/10/2016 18:24, Peter Levart wrote:
>>> Hi Brunoais,
>>>
>>> I'll try to tell what I know from my JMH practice:
>>>
>>> On 10/26/2016 10:30 AM, Brunoais wrote:
>>>> Hey guys. Any idea where I can find instructions on how to use JMH to:
>>>>
>>>> 1. Clear OS' file reading cache.
>>>
>>> You can create a public void method and have JMH call it before 
>>> each:
>>> - trial (a set of iterations)
>>> - iteration (a set of test method invocations)
>>> - invocation
>>>
>>> ...simply by annotating it with @Setup( [ Level.Trial | 
>>> Level.Iteration | Level.Invocation ] ).
>>>
>>> So create a method that spawns a script that clears the cache.
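>>>
>>> For example, something along these lines inside the benchmark class 
>>> (the script path is just an illustration; use whatever command drops 
>>> the page cache on your machine):
>>>
>>> @Setup(Level.Iteration)
>>> public void dropOsFileCache() throws Exception {
>>>     // Hypothetical helper script; on Linux it could run
>>>     // "sync; echo 3 > /proc/sys/vm/drop_caches" with sufficient privileges.
>>>     Process p = new ProcessBuilder("/usr/local/bin/drop-caches.sh")
>>>             .inheritIO()
>>>             .start();
>>>     if (p.waitFor() != 0) {
>>>         throw new IllegalStateException("could not drop the OS file cache");
>>>     }
>>> }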
>>>
>>>> 2. Warm up whatever it needs to (maybe reading from a Channel in 
>>>> memory).
>>>
>>> JMH already warms up the code and VM simply by executing "warmup" 
>>> iterations before starting the real measured iterations. You can 
>>> control the number of warm-up iterations and measured iterations by 
>>> annotating either the class or the method(s) with:
>>>
>>> @Warmup(iterations = ...)
>>> @Measurement(iterations = ...)
>>>
>>> If you want to warm up resources with code that is not the same as 
>>> the code in the test method(s), then maybe @Setup methods on 
>>> different levels could be used for that.
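>>>
>>> For instance, a minimal made-up benchmark, just to show where the 
>>> annotations go (the values are arbitrary):
>>>
>>> import java.util.concurrent.TimeUnit;
>>> import org.openjdk.jmh.annotations.*;
>>>
>>> @Warmup(iterations = 5)            // not measured, just warms up JIT/VM
>>> @Measurement(iterations = 10)      // measured iterations
>>> @BenchmarkMode(Mode.AverageTime)
>>> @OutputTimeUnit(TimeUnit.MILLISECONDS)
>>> @State(Scope.Thread)
>>> public class WarmupExample {
>>>     @Benchmark
>>>     public long sum() {
>>>         long s = 0;
>>>         for (int i = 0; i < 1_000; i++) s += i;
>>>         return s;                  // return the result so it is not dead-code eliminated
>>>     }
>>> }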
>>>
>>>>
>>>> 3. Create a BufferedInputStream with a FileInputStream inside, with
>>>>    configurable buffer sizes.
>>>
>>> You can annotate a field of int, long or String type in a class 
>>> annotated with @State (which can be the benchmark class itself) 
>>> with the @Param annotation, enumerating the values this field will 
>>> get before the @Setup(Level.Trial) method(s) execute. So you enumerate 
>>> the buffer sizes in the @Param annotation and instantiate the 
>>> BufferedInputStream using that value in a @Setup method. Voila.
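>>>
>>> A rough sketch of such a state class (file path and sizes are 
>>> placeholders; I open the stream per iteration so a single-shot 
>>> measurement always reads a fresh stream):
>>>
>>> import java.io.*;
>>> import org.openjdk.jmh.annotations.*;
>>>
>>> @State(Scope.Benchmark)
>>> public class BufferState {
>>>     @Param({"8192", "65536", "1048576"})
>>>     public int bufferSize;
>>>
>>>     public BufferedInputStream in;
>>>
>>>     @Setup(Level.Iteration)
>>>     public void open() throws IOException {
>>>         in = new BufferedInputStream(new FileInputStream("/tmp/bigfile.bin"), bufferSize);
>>>     }
>>>
>>>     @TearDown(Level.Iteration)
>>>     public void close() throws IOException {
>>>         in.close();
>>>     }
>>> }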
>>>
>>>> 4. Execute iterations to read the file fully.
>>>
>>> Then perhaps you could use only one invocation per iteration and 
>>> measure it using @BenchmarkMode(Mode.SingleShotTime), constructing 
>>> the loop yourself.
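>>>
>>> Continuing the sketch above, the measured method could then be 
>>> (the chunk size is arbitrary):
>>>
>>> import java.io.IOException;
>>> import java.util.concurrent.TimeUnit;
>>> import org.openjdk.jmh.annotations.*;
>>>
>>> @Benchmark
>>> @BenchmarkMode(Mode.SingleShotTime)
>>> @OutputTimeUnit(TimeUnit.MILLISECONDS)
>>> public long readWholeFile(BufferState state) throws IOException {
>>>     byte[] buf = new byte[8192];
>>>     long total = 0;
>>>     int n;
>>>     while ((n = state.in.read(buf)) != -1) {
>>>         total += n;               // one invocation reads the whole file
>>>     }
>>>     return total;                 // return something so the loop is not optimized away
>>> }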
>>>
>>>> 1. Allow setting the byte[] size.
>>>
>>> Use @Param on a field to hold the byte[] size and create the 
>>> byte[] in a @Setup method...
>>>
>>>> 2. On each iteration, burn a set number of CPU cycles.
>>>
>>> Blackhole.consumeCPU(tokens)
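>>>
>>> For example, to simulate per-chunk processing inside the read loop 
>>> from the sketch above (the token count is arbitrary):
>>>
>>> import org.openjdk.jmh.infra.Blackhole;
>>>
>>> @Benchmark
>>> public long readWithFakeWork(BufferState state) throws IOException {
>>>     byte[] buf = new byte[8192];
>>>     long total = 0;
>>>     int n;
>>>     while ((n = state.in.read(buf)) != -1) {
>>>         total += n;
>>>         Blackhole.consumeCPU(1_000);   // burn a fixed number of CPU "tokens" per chunk
>>>     }
>>>     return total;
>>> }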
>>>
>>>> 5. Re-execute 1, 3 and 4 but with a BufferedNonBlockStream and a
>>>>    FileChannel.
>>>
>>> If you wrap them all into a common API (by delegation), you can use 
>>> @Param String implType, with a @Setup method to instantiate the 
>>> appropriate implementation. Then just invoke the common API in the 
>>> test method.
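>>>
>>> A sketch of that, using InputStream as the common API (the file path 
>>> is a placeholder, and the BufferedNonBlockStream constructor is 
>>> assumed here; adjust it to whatever your prototype actually exposes):
>>>
>>> import java.io.*;
>>> import java.nio.channels.FileChannel;
>>> import java.nio.file.Paths;
>>> import java.nio.file.StandardOpenOption;
>>> import org.openjdk.jmh.annotations.*;
>>>
>>> @State(Scope.Benchmark)
>>> public class SourceState {
>>>     @Param({"buffered", "nonblock"})
>>>     public String implType;
>>>
>>>     public InputStream source;
>>>
>>>     @Setup(Level.Iteration)
>>>     public void open() throws IOException {
>>>         String file = "/tmp/bigfile.bin";
>>>         if ("buffered".equals(implType)) {
>>>             source = new BufferedInputStream(new FileInputStream(file));
>>>         } else {
>>>             // Assumed constructor for the BufferedNonBlockStream prototype.
>>>             source = new BufferedNonBlockStream(
>>>                     FileChannel.open(Paths.get(file), StandardOpenOption.READ));
>>>         }
>>>     }
>>>
>>>     @TearDown(Level.Iteration)
>>>     public void close() throws IOException {
>>>         source.close();
>>>     }
>>> }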
>>>
>>>>
>>>> So far I still can't find how to:
>>>>
>>>> 1 (clear OS' cache)
>>>> 3 (the configuration part)
>>>> 4 (variable number of iterations)
>>>> 4.1 (the configuration)
>>>>
>>>> Can someone please point me in the right direction?
>>>
>>> I can create an example test if you like and you can then extend it...
>>>
>>> Regards, Peter
>>>
>>>>
>>>>
>>>> On 26/10/2016 07:57, Brunoais wrote:
>>>>>
>>>>> Hey Bernd!
>>>>>
>>>>> I don't know how long ago you did that, but I'm getting 
>>>>> positive results with my non-JMH tests. I do have to sanity-check 
>>>>> my results, though. After some reads, the OS starts caching the 
>>>>> file, which is not what I want. It's easy to tell when that 
>>>>> happens: the times fall from ~30s to ~5s and the HDD stays 
>>>>> nearly idle while reading (just looking at the LED is enough to see it).
>>>>>
>>>>> If you don't simulate synchronous work and you only run the reads, you 
>>>>> will only see marginal gains, as the OS has no real time to fill 
>>>>> the buffer.
>>>>> My research shows the 2 major kernels (Windows' and GNU/Linux's) 
>>>>> have non-blocking user-level buffer handling where I hand a buffer 
>>>>> to the OS to read into and it keeps filling it and sending 
>>>>> messages/signals as it writes chunks. Linux also has an OS interrupt 
>>>>> that only sends the signal after the buffer is full, though. There's also 
>>>>> another variant where they use an internal buffer of the same 
>>>>> size as the buffer you allocate for the OS and then internally 
>>>>> call memcpy() into your user-level memory when asked. Tests on 
>>>>> the internet show that memcpy() is as fast as (for 0-1 elements) or 
>>>>> faster than System.arraycopy(). I have no idea if they are true.
>>>>>
>>>>> All this was to add that that code is tuned to copy from 
>>>>> the read buffer only when it is at least at half capacity and 
>>>>> the internal buffer has enough free space. The copy is 
>>>>> forced only if nothing was read on the previous fill() call. 
>>>>> It is built to cross into JNI as little as possible while providing the 
>>>>> major contract BufferedInputStream has.
>>>>> Finally, I never, ever compact the read buffer. That requires 
>>>>> a memcpy which is definitely not necessary.
>>>>>
>>>>> Anyway, those timing tests I made were just to get an order of 
>>>>> magnitude for the speed difference. I intended to do them 
>>>>> differently, but JMH looks good so I'll use JMH to test now.
>>>>>
>>>>> Short reads only happen when fill(true) is called. That happens 
>>>>> for a desperate fetch of data.
>>>>>
>>>>> I'll look into avoiding the double read requests. I don't think it 
>>>>> will bring significant improvements, if any at all. It only 
>>>>> happens when the buffer is nearly empty and any byte of data is 
>>>>> welcome "at any cost".
>>>>> Besides, whoever called read at that point would also have seen an 
>>>>> availability() of 0 and still called read()/read(byte[]).
>>>>>
>>>>>
>>>>> On 26/10/2016 06:14, Bernd Eckenfels wrote:
>>>>>>  Hello Brunoais,
>>>>>>
>>>>>> In the past I did some experiments with non-blocking file 
>>>>>> channels in the hope of increasing throughput in a similar way 
>>>>>> to your buffered stream. I also used directly allocated buffers. 
>>>>>> However, my results were not that encouraging (especially if 
>>>>>> an upper layer used larger reads). I thought at the time this 
>>>>>> was mostly due to the fact that it does NOT wrap to real AsyncFIO on 
>>>>>> most platforms. But maybe I just measured it wrong, so I will 
>>>>>> have a closer look at your impl.
>>>>>>
>>>>>> Generally I would recommend making the benchmark a bit more 
>>>>>> reliable with JMH and, in order to do this, externalizing the 
>>>>>> direct buffer allocation (as it is slow if done repeatedly). 
>>>>>> This also allows you to publish some results with varying 
>>>>>> workloads (on different machines).
>>>>>>
>>>>>> I would also measure the readCount to see if short reads happen.
>>>>>>
>>>>>>  BTW, I might as well try to read only up to the end of the buffer 
>>>>>> in the backfilling-wraps-around case and not issue two requests; 
>>>>>> that might remove some additional latency.
>>>>>>
>>>>>> Regards
>>>>>> Bernd
>>>>>> -- 
>>>>>> http://bernd.eckenfels.net
>>>>>>
>>>>>> _____________________________
>>>>>> From: Brunoais <brunoaiss at gmail.com>
>>>>>> Sent: Monday, October 24, 2016 6:30 PM
>>>>>> Subject: Re: Request/discussion: BufferedReader reading using 
>>>>>> async API while providing sync API
>>>>>> To: Pavel Rappo <pavel.rappo at oracle.com>
>>>>>> Cc: <core-libs-dev at openjdk.java.net>
>>>>>>
>>>>>>
>>>>>> Attached and sending!
>>>>>>
>>>>>>
>>>>>> On 24/10/2016 13:48, Pavel Rappo wrote:
>>>>>> > Could you please send a new email on this list with the source 
>>>>>> attached as a
>>>>>> > text file?
>>>>>> >
>>>>>> >> On 23 Oct 2016, at 19:14, Brunoais <brunoaiss at gmail.com> wrote:
>>>>>> >>
>>>>>> >> Here's my poc/prototype:
>>>>>> >> http://pastebin.com/WRpYWDJF
>>>>>> >>
>>>>>> >> I've implemented the bare minimum of the class that follows 
>>>>>> the same contract as BufferedReader, while signaling in comments all 
>>>>>> the issues I think it has or may have.
>>>>>> >> I also wrote some javadoc to help guide readers through the class.
>>>>>> >>
>>>>>> >> I could have used more fields from BufferedReader, but the 
>>>>>> names were so minimalistic that they were confusing me. I intend to 
>>>>>> change them before sending this to OpenJDK.
>>>>>> >>
>>>>>> >> One of the major problems this has is long overflow. It is 
>>>>>> major because it is hidden: it will be extremely rare and it 
>>>>>> takes a really long time to reproduce. There are different ways 
>>>>>> of dealing with it, from just documenting it to actually writing 
>>>>>> code that handles it.
>>>>>> >>
>>>>>> >> I built some simple test code for it to get an idea of its 
>>>>>> performance and correctness.
>>>>>> >>
>>>>>> >> http://pastebin.com/eh6LFgwT
>>>>>> >>
>>>>>> >> This doesn't thoroughly test whether it is actually working 
>>>>>> correctly, but I see no reason for it not to work correctly after 
>>>>>> fixing the 2 bugs that test found.
>>>>>> >>
>>>>>> >> I'll also leave here some conclusions about speed and resource 
>>>>>> consumption I found.
>>>>>> >>
>>>>>> >> I made tests with the default buffer size, 5_000 B, 15_000 B and 
>>>>>> 500_000 B. I noticed that, with my hardware and the 1_530_000_000 B 
>>>>>> file, I was getting around:
>>>>>> >>
>>>>>> >> With all buffer sizes and fake work: 10~15 s speed improvement (from 
>>>>>> 90% HDD speed to 100% HDD speed)
>>>>>> >> With all buffer sizes and no fake work: 1~2 s speed improvement (from 
>>>>>> 90% HDD speed to 100% HDD speed)
>>>>>> >>
>>>>>> >> Changing the buffer size gave different reading speeds, 
>>>>>> but both implementations changed by roughly the same amount when 
>>>>>> the buffer size changed.
>>>>>> >> Finally, I could always confirm that I/O was the 
>>>>>> slowest part while this code was running.
>>>>>> >>
>>>>>> >> For those wondering about the file size: it is both to 
>>>>>> avoid the OS cache and to keep the reading at the main use-case these 
>>>>>> objects are for (large streams of bytes).
>>>>>> >>
>>>>>> >> @Pavel, are you open for discussion now ;)? Need anything else?
>>>>>> >>
>>>>>> >> On 21/10/2016 19:21, Pavel Rappo wrote:
>>>>>> >>> Just to append to my previous email: BufferedReader wraps any 
>>>>>> Reader out there,
>>>>>> >>> not specifically FileReader, while you're talking about the 
>>>>>> case of efficient
>>>>>> >>> reading from a file.
>>>>>> >>>
>>>>>> >>> I guess there's one existing possibility to provide exactly 
>>>>>> what you need (as I
>>>>>> >>> understand it) under this method:
>>>>>> >>>
>>>>>> >>> /**
>>>>>> >>> * Opens a file for reading, returning a {@code 
>>>>>> BufferedReader} to read text
>>>>>> >>> * from the file in an efficient manner...
>>>>>> >>> ...
>>>>>> >>> */
>>>>>> >>> java.nio.file.Files#newBufferedReader(java.nio.file.Path)
>>>>>> >>>
>>>>>> >>> It can return _anything_ as long as it is a BufferedReader. 
>>>>>> We can do it, but it
>>>>>> >>> needs to be investigated not only for your favorite OS but 
>>>>>> for other OSes as
>>>>>> >>> well. Feel free to prototype this and we can discuss it on 
>>>>>> the list later.
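>>>>>> >>>
>>>>>> >>> For example, callers would keep using it as a plain BufferedReader
>>>>>> >>> (the path is a placeholder; needs java.io.BufferedReader and
>>>>>> >>> java.nio.file.{Files, Path, Paths}):
>>>>>> >>>
>>>>>> >>> Path p = Paths.get("/data/big.txt");
>>>>>> >>> try (BufferedReader r = Files.newBufferedReader(p)) {
>>>>>> >>>     System.out.println(r.lines().count());  // whatever subclass comes back, same API
>>>>>> >>> }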
>>>>>> >>>
>>>>>> >>> Thanks,
>>>>>> >>> -Pavel
>>>>>> >>>
>>>>>> >>>> On 21 Oct 2016, at 18:56, Brunoais <brunoaiss at gmail.com> wrote:
>>>>>> >>>>
>>>>>> >>>> Pavel is right.
>>>>>> >>>>
>>>>>> >>>> In reality, I was expecting such a BufferedReader to use only 
>>>>>> a single buffer and have that buffer be filled asynchronously, 
>>>>>> not in a different thread.
>>>>>> >>>> Additionally, I don't intend to use a larger 
>>>>>> buffer than before unless requested through the API (the constructor).
>>>>>> >>>>
>>>>>> >>>> In my idea, internally, it is supposed to use 
>>>>>> java.nio.channels.AsynchronousFileChannel or equivalent.
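>>>>>> >>>>
>>>>>> >>>> Roughly along these lines (just a sketch, not the prototype; the
>>>>>> >>>> path and buffer size are placeholders):
>>>>>> >>>>
>>>>>> >>>> // needs: java.nio.ByteBuffer, java.nio.channels.AsynchronousFileChannel,
>>>>>> >>>> //        java.nio.file.{Paths, StandardOpenOption}, java.util.concurrent.Future
>>>>>> >>>> static void sketch() throws Exception {
>>>>>> >>>>     AsynchronousFileChannel ch = AsynchronousFileChannel.open(
>>>>>> >>>>             Paths.get("/data/big.txt"), StandardOpenOption.READ);
>>>>>> >>>>     ByteBuffer next = ByteBuffer.allocateDirect(64 * 1024);
>>>>>> >>>>     Future<Integer> pending = ch.read(next, 0);  // OS starts filling in the background
>>>>>> >>>>     // ...meanwhile, read() calls are served from the buffer filled previously...
>>>>>> >>>>     int filled = pending.get();                  // blocks only if that fill is not done yet
>>>>>> >>>>     ch.close();
>>>>>> >>>> }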
>>>>>> >>>>
>>>>>> >>>> It does not prevent having two buffers, and I do not intend 
>>>>>> to change BufferedReader itself. I'd make a BufferedAsyncReader of 
>>>>>> sorts (any name suggestion is welcome, as I'm an awful namer).
>>>>>> >>>>
>>>>>> >>>>
>>>>>> >>>> On 21/10/2016 18:38, Roger Riggs wrote:
>>>>>> >>>>> Hi Pavel,
>>>>>> >>>>>
>>>>>> >>>>> I think Brunoais is asking for a double-buffering scheme in 
>>>>>> which the implementation of
>>>>>> >>>>> BufferedReader fills a second buffer in parallel with the 
>>>>>> application reading from the 1st buffer,
>>>>>> >>>>> managing the swaps and async reads transparently.
>>>>>> >>>>> It would not change the API but would change the 
>>>>>> interactions between the buffered reader
>>>>>> >>>>> and the underlying stream. It would also increase memory 
>>>>>> requirements and processing
>>>>>> >>>>> by introducing or using a separate thread and the necessary 
>>>>>> synchronization.
>>>>>> >>>>>
>>>>>> >>>>> Though I think the formal interface semantics could be 
>>>>>> maintained, I have doubts
>>>>>> >>>>> about compatibility and its unintended consequences on 
>>>>>> existing subclasses,
>>>>>> >>>>> applications and libraries.
>>>>>> >>>>>
>>>>>> >>>>> $.02, Roger
>>>>>> >>>>>
>>>>>> >>>>> On 10/21/16 1:22 PM, Pavel Rappo wrote:
>>>>>> >>>>>> Off the top of my head, I would say it's not possible to 
>>>>>> change the design of an
>>>>>> >>>>>> _extensible_ type that has been out there for 20 or so 
>>>>>> years. All these I/O
>>>>>> >>>>>> streams from java.io were designed for the 
>>>>>> simple synchronous use case.
>>>>>> >>>>>>
>>>>>> >>>>>> It's not that their design is flawed in some way, it's 
>>>>>> that they don't seem to
>>>>>> >>>>>> suit your needs. Have you considered using 
>>>>>> java.nio.channels.AsynchronousFileChannel
>>>>>> >>>>>> in your applications?
>>>>>> >>>>>>
>>>>>> >>>>>> -Pavel
>>>>>> >>>>>>
>>>>>> >>>>>>> On 21 Oct 2016, at 17:08, Brunoais <brunoaiss at gmail.com> wrote:
>>>>>> >>>>>>>
>>>>>> >>>>>>> Any feedback on this? I'm really interested in 
>>>>>> implementing such a BufferedReader/BufferedStreamReader to allow 
>>>>>> speeding up my applications without having to think asynchronously 
>>>>>> or about multi-threading while programming with it.
>>>>>> >>>>>>>
>>>>>> >>>>>>> That's why I'm asking this here.
>>>>>> >>>>>>>
>>>>>> >>>>>>>
>>>>>> >>>>>>> On 13/10/2016 14:45, Brunoais wrote:
>>>>>> >>>>>>>> Hi,
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> I looked at the BufferedReader source code for Java 9 along 
>>>>>> with the source code of the channels/streams it uses. I noticed 
>>>>>> that, like in Java 7, BufferedReader does not use an async API to 
>>>>>> load data from files; instead, the data loading is all done 
>>>>>> synchronously, even when the OS allows requesting a file to be 
>>>>>> read and getting a notification later when the file has effectively been read.
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Why is BufferedReader not async while providing a sync API?
>>>>>> >>>>>>>>
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


