Request/discussion: BufferedReader reading using async API while providing sync API

Wed Oct 26 20:41:52 UTC 2016

On 10/26/2016 09:46 PM, Brunoais wrote:
>
> Thank you.
>
> Only one thing left. How can I "burn" the OS' file read cache?
> I only know how to do that by allocating a very large amount of memory
> based on the cached I/O figure I see in the resource manager (Windows) or
> system monitor (Linux) and then running the program. In this
> case, I have no idea how much memory each person's computer has, so I
> cannot use the same method. How would you write such a program excerpt?
>
> As for the rest of the pointers: thank you I'll start building the 
> benchmark code based on that information.
>

Here's a prototype you can extend. It should let you jump-start in 5th gear:

http://cr.openjdk.java.net/~plevart/misc/FileReadBench/FileReadBench.java

Regards, Peter

>
> On 26/10/2016 18:24, Peter Levart wrote:
>> Hi Brunoais,
>>
>> I'll try to tell what I know from my JMH practice:
>>
>> On 10/26/2016 10:30 AM, Brunoais wrote:
>>> Hey guys. Any idea where I can find instructions on how to use JMH to:
>>>
>>> 1. Clear OS' file reading cache.
>>
>> You can create a public void method and have JMH call it before
>> each:
>> - trial (a set of iterations)
>> - iteration (a set of test method invocations)
>> - invocation
>>
>> ...simply by annotating it with @Setup( [ Level.Trial | Level.Iteration
>> | Level.Invocation ] ).
>>
>> So create a method that spawns a script that clears the cache.
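A minimal sketch of the "spawn a script" part, using only the JDK. The class and method names are hypothetical. The Linux command is an assumption about the target machine: it needs root, and it is not portable to Windows. In a JMH benchmark, a @Setup method would call run(...).

```java
import java.io.IOException;
import java.util.List;

public class CacheDropper {
    /** Runs an external command, discards its output and returns its exit code. */
    public static int run(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command)
                .redirectErrorStream(true)                      // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.DISCARD) // never block on a full pipe
                .start();
        return p.waitFor();
    }

    /**
     * Assumption: Linux, running as root. Flushes dirty pages, then asks the
     * kernel to drop the page cache, dentries and inodes.
     */
    public static List<String> linuxDropCachesCommand() {
        return List.of("sh", "-c", "sync; echo 3 > /proc/sys/vm/drop_caches");
    }
}
```

A non-zero exit code from run(CacheDropper.linuxDropCachesCommand()) most likely means the process lacked the privileges, in which case the measurements that follow would still hit the cache.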
>>
>>> 2. Warm up whatever it needs to (maybe reading from a Channel in 
>>> memory).
>>
>> JMH already warms up the code and the VM simply by executing "warmup"
>> iterations before starting real measured iterations. You can control 
>> the number of warm-up iterations and real measured iterations by 
>> annotating either the class or the method(s) with:
>>
>> @Warmup(iterations = ...)
>> @Measurement(iterations = ...)
>>
>> If you want to warm up resources with code that differs from the code
>> in the test method(s), then maybe @Setup methods on different levels
>> could be used for that.
>>
>>>
>>> 3. Create a BufferedInputStream with a FileInputStream inside, with
>>>    configurable buffer sizes.
>>
>> In a class annotated with @State (this can be the benchmark class
>> itself), you can annotate a field of type int, long or String with
>> @Param, enumerating the values the field will take before the
>> @Setup(Level.Trial) method(s) execute. So you enumerate the
>> buffer sizes in the @Param annotation and instantiate the
>> BufferedInputStream using the value in the @Setup method. Voilà.
>>
>>> 4. Execute iterations to read the file fully.
>>
>> Then perhaps you could use only one invocation per iteration and
>> measure it using @BenchmarkMode(Mode.SingleShotTime), constructing
>> the loop yourself.
>>
>>> 1. Allow setting the byte[] size.
>>
>> Use @Param on a field to hold the byte[] size and create the
>> byte[] in the @Setup method...
>>
>>> 2. On each iteration, burn a set number of CPU cycles.
>>
>> Blackhole.consumeCPU(tokens)
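The hand-written loop for items 3, 4, 4.1 and 4.2 could look roughly like the sketch below. It uses only the JDK so it runs without JMH. The names FullFileRead and readFully are hypothetical, and the fakeWork callback is a stand-in for the CPU-burning call. In a real benchmark, bufferSize and chunkSize would be the fields whose values the harness enumerates.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;

public class FullFileRead {
    /**
     * Reads the whole file through a BufferedInputStream and returns the
     * number of bytes read. fakeWork runs once per successful read call.
     */
    public static long readFully(Path file, int bufferSize, int chunkSize, Runnable fakeWork)
            throws IOException {
        if (chunkSize <= 0) {
            // A zero-length array would make read() return 0 forever.
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        try (InputStream in = new BufferedInputStream(
                new FileInputStream(file.toFile()), bufferSize)) {
            int n;
            while ((n = in.read(chunk)) != -1) { // -1 signals end of file
                total += n;
                fakeWork.run();                  // simulated per-chunk processing
            }
        }
        return total;
    }
}
```

Returning the byte count gives the caller something to hand back to the harness, so the reads cannot be optimized away as dead code.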
>>
>>> 5. Re-execute 1, 3 and 4 but with a BufferedNonBlockStream and a
>>>    FileChannel.
>>
>> If you wrap them all into a common API (by delegation), you can use
>> @Param String implType, with a @Setup method to instantiate the
>> appropriate implementation. Then just invoke the common API in the
>> test method.
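One way the common API by delegation might be sketched is shown below. ByteSource, ReadImpls and the "stream"/"channel" keys are hypothetical names. A plain FileChannel stands in for BufferedNonBlockStream, which is not available here, and it ignores bufferSize.

```java
import java.io.BufferedInputStream;
import java.io.Closeable;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadImpls {
    /** Common API the test method would call, whatever the implementation. */
    public interface ByteSource extends Closeable {
        int read(byte[] dst) throws IOException; // returns -1 at end of file
    }

    /** Picks the implementation by name, as a String parameter would. */
    public static ByteSource open(String implType, Path file, int bufferSize) throws IOException {
        switch (implType) {
            case "stream": {
                InputStream in = new BufferedInputStream(
                        new FileInputStream(file.toFile()), bufferSize);
                return new ByteSource() {
                    @Override public int read(byte[] dst) throws IOException { return in.read(dst); }
                    @Override public void close() throws IOException { in.close(); }
                };
            }
            case "channel": {
                // Stand-in implementation: unbuffered, so bufferSize is unused.
                FileChannel ch = FileChannel.open(file, StandardOpenOption.READ);
                return new ByteSource() {
                    @Override public int read(byte[] dst) throws IOException {
                        return ch.read(ByteBuffer.wrap(dst));
                    }
                    @Override public void close() throws IOException { ch.close(); }
                };
            }
            default:
                throw new IllegalArgumentException("unknown implType: " + implType);
        }
    }

    /** Reads the source to the end and returns the number of bytes seen. */
    public static long drain(ByteSource src, byte[] chunk) throws IOException {
        if (chunk.length == 0) throw new IllegalArgumentException("chunk must not be empty");
        long total = 0;
        int n;
        while ((n = src.read(chunk)) != -1) total += n;
        return total;
    }
}
```

With this shape the measured method only ever sees ByteSource, so both implementations go through exactly the same loop.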
>>
>>>
>>> So far I still can't find how to:
>>>
>>> 1 (clear OS' cache)
>>> 3 (the configuration part)
>>> 4 (variable number of iterations)
>>> 4.1 (the configuration)
>>>
>>> Can someone please point me in the right direction?
>>
>> I can create an example test if you like and you can then extend it...
>>
>> Regards, Peter
>>
>>>
>>>
>>> On 26/10/2016 07:57, Brunoais wrote:
>>>>
>>>> Hey Bernd!
>>>>
>>>> I don't know how long ago you did that, but I'm getting
>>>> positive results with my non-JMH tests. I do have to check my
>>>> results against logic. After some reads, the OS starts caching the
>>>> file, which is not what I want. It's easy to tell when that happens,
>>>> though. The times fall from ~30s to ~5s and the HDD stays nearly idle
>>>> (just looking at the LED is enough to understand).
>>>>
>>>> If you don't test synchronous work and you only run the reads, you 
>>>> will only get marginal results as the OS has no real time to fill 
>>>> the buffer.
>>>> My research shows the 2 major kernels (Windows and GNU/Linux) have
>>>> non-blocking user-level buffer handling where I give a buffer for
>>>> the OS to read into and it keeps filling it and sending
>>>> messages/signals as it writes chunks. Linux has an OS interrupt that
>>>> only sends the signal after the buffer is full, though. There's also
>>>> another version where they use an internal buffer of the same size as
>>>> the buffer you allocate for the OS and then internally call memcpy()
>>>> into your user-level memory when asked. Tests on the internet show
>>>> that memcpy() is as fast (for 0-1 elements) or faster than
>>>> System.arraycopy(). I have no idea if they are true.
>>>>
>>>> All this is to say that the code is tuned to copy from
>>>> the read buffer only when it is at least at half capacity and the
>>>> internal buffer has enough storage space. The process is forced
>>>> only if nothing was read on the previous fill() call. It is
>>>> built to use JNI as little as possible while providing the major
>>>> contract BufferedInputStream has.
>>>> Finally, I never, ever compact the read buffer. That requires doing a
>>>> memcpy(), which is definitely not necessary.
>>>>
>>>> Anyway, the timing tests I made were just to get an order of
>>>> magnitude for the speed difference. I intended to do them differently
>>>> but JMH looks good so I'll use JMH to test now.
>>>>
>>>> Short reads only happen when fill(true) is called. That happens when
>>>> data is desperately needed.
>>>>
>>>> I'll look into avoiding the double read requests. I do think it
>>>> won't bring significant improvements, if any at all. It only happens
>>>> when the buffer is nearly empty and any byte of data is welcome "at
>>>> any cost".
>>>> Besides, whoever called read at that point would also have had an
>>>> available() of 0 and still called read()/read(byte[]).
>>>>
>>>>
>>>> On 26/10/2016 06:14, Bernd Eckenfels wrote:
>>>>>  Hello Brunoais,
>>>>>
>>>>> In the past I did some experiments with non-blocking file channels
>>>>> in the hope of increasing throughput in a similar way to your
>>>>> buffered stream. I also used directly allocated buffers. However, my
>>>>> results were not that encouraging (especially if an upper
>>>>> layer used larger reads). Back then I thought this was
>>>>> mostly due to the fact that it does NOT map to real async file I/O
>>>>> on most platforms. But maybe I just measured it wrong, so I will
>>>>> have a closer look at your impl.
>>>>>
>>>>> Generally I would recommend making the benchmark a bit more
>>>>> reliable with JMH and, in order to do this, externalizing the
>>>>> direct buffer allocation (as it is slow if done repeatedly).
>>>>> This also allows you to publish some results with varying
>>>>> workloads (on different machines).
>>>>>
>>>>> I would also measure the readCount to see if short reads happen.
>>>>>
>>>>>  BTW, I might as well try to only read till the end of the buffer 
>>>>> in the backfilling-wraps-around case and not issue two requests, 
>>>>> that might remove some additional latency.
>>>>>
>>>>> Regards
>>>>> Bernd
>>>>> -- 
>>>>> http://bernd.eckenfels.net
>>>>>
>>>>> _____________________________
>>>>> From: Brunoais <brunoaiss at gmail.com>
>>>>> Sent: Monday, October 24, 2016 6:30 PM
>>>>> Subject: Re: Request/discussion: BufferedReader reading using
>>>>> async API while providing sync API
>>>>> To: Pavel Rappo <pavel.rappo at oracle.com>
>>>>> Cc: <core-libs-dev at openjdk.java.net>
>>>>>
>>>>>
>>>>> Attached and sending!
>>>>>
>>>>>
>>>>> On 24/10/2016 13:48, Pavel Rappo wrote:
>>>>> > Could you please send a new email on this list with the source
>>>>> > attached as a text file?
>>>>> >
>>>>> >> On 23 Oct 2016, at 19:14, Brunoais <brunoaiss at gmail.com> wrote:
>>>>> >>
>>>>> >> Here's my poc/prototype:
>>>>> >> http://pastebin.com/WRpYWDJF
>>>>> >>
>>>>> >> I've implemented the bare minimum of the class that follows the
>>>>> >> same contract as BufferedReader while flagging in comments all the
>>>>> >> issues I think it has or may have.
>>>>> >> I also wrote some javadoc to help guide readers through the class.
>>>>> >>
>>>>> >> I could have used more fields from BufferedReader but the names
>>>>> >> were so minimalistic that they were confusing me. I intend to
>>>>> >> change them before sending this to OpenJDK.
>>>>> >>
>>>>> >> One of the major problems this has is long overflow. It is
>>>>> >> major because it is hidden, it will be extremely rare and it takes
>>>>> >> a really long time to reproduce. There are different ways of
>>>>> >> dealing with it, from just documenting it to actually writing code
>>>>> >> that handles it.
>>>>> >>
>>>>> >> I built some simple test code for it to get some idea of its
>>>>> >> performance and correctness.
>>>>> >>
>>>>> >> http://pastebin.com/eh6LFgwT
>>>>> >>
>>>>> >> This doesn't thoroughly test whether it is actually working
>>>>> >> correctly, but I see no reason for it not to work correctly after
>>>>> >> fixing the 2 bugs that the test found.
>>>>> >>
>>>>> >> I'll also leave here some conclusions about speed and resource
>>>>> >> consumption I found.
>>>>> >>
>>>>> >> I made tests with default buffer sizes, 5000B, 15_000B and
>>>>> >> 500_000B. I noticed that, with my hardware, with the
>>>>> >> 1 530 000 000B file, I was getting around:
>>>>> >>
>>>>> >> In all buffers and fake work: 10~15s speed improvement (from
>>>>> >> 90% HDD speed to 100% HDD speed)
>>>>> >> In all buffers and no fake work: 1~2s speed improvement (from
>>>>> >> 90% HDD speed to 100% HDD speed)
>>>>> >>
>>>>> >> Changing the buffer size gave different reading speeds, but both
>>>>> >> changed by about the same amount when the buffer size changed.
>>>>> >> Finally, I could always confirm that I/O was the slowest
>>>>> >> thing while this code was running.
>>>>> >>
>>>>> >> For those wondering about the file size: it is both to avoid the
>>>>> >> OS cache and to exercise the main use case these
>>>>> >> objects are for (large streams of bytes).
>>>>> >>
>>>>> >> @Pavel, are you open for discussion now ;)? Need anything else?
>>>>> >>
>>>>> >> On 21/10/2016 19:21, Pavel Rappo wrote:
>>>>> >>> Just to append to my previous email: BufferedReader wraps any
>>>>> >>> Reader out there, not specifically FileReader, while you're
>>>>> >>> talking about the case of efficient reading from a file.
>>>>> >>>
>>>>> >>> I guess there's one existing possibility to provide exactly
>>>>> >>> what you need (as I understand it) under this method:
>>>>> >>>
>>>>> >>> /**
>>>>> >>> * Opens a file for reading, returning a {@code BufferedReader}
>>>>> >>> * to read text from the file in an efficient manner...
>>>>> >>> ...
>>>>> >>> */
>>>>> >>> java.nio.file.Files#newBufferedReader(java.nio.file.Path)
>>>>> >>>
>>>>> >>> It can return _anything_ as long as it is a BufferedReader. We
>>>>> >>> can do it, but it needs to be investigated not only for your
>>>>> >>> favorite OS but for other OSes as well. Feel free to prototype
>>>>> >>> this and we can discuss it on the list later.
>>>>> >>>
>>>>> >>> Thanks,
>>>>> >>> -Pavel
>>>>> >>>
>>>>> >>>> On 21 Oct 2016, at 18:56, Brunoais <brunoaiss at gmail.com> wrote:
>>>>> >>>>
>>>>> >>>> Pavel is right.
>>>>> >>>>
>>>>> >>>> In reality, I was expecting such a BufferedReader to use only a
>>>>> >>>> single buffer and have that buffer filled asynchronously,
>>>>> >>>> not in a different thread.
>>>>> >>>> Additionally, I don't intend to have a larger buffer than
>>>>> >>>> before unless requested through the API (the constructor).
>>>>> >>>>
>>>>> >>>> In my idea, internally, it is supposed to use
>>>>> >>>> java.nio.channels.AsynchronousFileChannel or equivalent.
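For illustration only, here is a sketch of overlapping reads with processing through java.nio.channels.AsynchronousFileChannel. It uses two buffers rather than the single buffer described above, and it is not the prototype discussed in this thread. AsyncPrefetchRead and countBytes are hypothetical names. Whether the channel maps to true OS-level asynchronous I/O or to a thread pool depends on the platform.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class AsyncPrefetchRead {
    /**
     * Reads the whole file while keeping one read in flight during the
     * processing of the previous chunk. Returns the number of bytes seen.
     */
    public static long countBytes(Path file, int bufferSize)
            throws IOException, InterruptedException, ExecutionException {
        try (AsynchronousFileChannel ch =
                     AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer current = ByteBuffer.allocate(bufferSize);
            ByteBuffer next = ByteBuffer.allocate(bufferSize);
            long position = 0;
            long total = 0;
            Future<Integer> pending = ch.read(current, position);
            while (true) {
                int n = pending.get();              // wait for the outstanding read
                if (n <= 0) break;                  // -1 signals end of file
                position += n;
                next.clear();
                pending = ch.read(next, position);  // start the next read first...
                current.flip();
                total += current.remaining();       // ...then "process" this chunk
                ByteBuffer tmp = current;           // swap roles for the next round
                current = next;
                next = tmp;
            }
            return total;
        }
    }
}
```

The caller still sees a plain synchronous method, and the only blocking point is pending.get().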
>>>>> >>>>
>>>>> >>>> It does not prevent having two buffers and I do not intend to
>>>>> >>>> change BufferedReader itself. I'd write a BufferedAsyncReader of
>>>>> >>>> sorts (any name suggestion is welcome as I'm awful at naming).
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> On 21/10/2016 18:38, Roger Riggs wrote:
>>>>> >>>>> Hi Pavel,
>>>>> >>>>>
>>>>> >>>>> I think Brunoais is asking for a double buffering scheme in
>>>>> >>>>> which the implementation of BufferedReader fills (a second
>>>>> >>>>> buffer) in parallel with the application reading from the 1st
>>>>> >>>>> buffer, managing the swaps and async reads transparently.
>>>>> >>>>> It would not change the API but would change the
>>>>> >>>>> interactions between the buffered reader and the underlying
>>>>> >>>>> stream. It would also increase memory requirements and
>>>>> >>>>> processing by introducing or using a separate thread and the
>>>>> >>>>> necessary synchronization.
>>>>> >>>>>
>>>>> >>>>> Though I think the formal interface semantics could be
>>>>> >>>>> maintained, I have doubts about compatibility and its
>>>>> >>>>> unintended consequences on existing subclasses,
>>>>> >>>>> applications and libraries.
>>>>> >>>>>
>>>>> >>>>> $.02, Roger
>>>>> >>>>>
>>>>> >>>>> On 10/21/16 1:22 PM, Pavel Rappo wrote:
>>>>> >>>>>> Off the top of my head, I would say it's not possible to
>>>>> >>>>>> change the design of an _extensible_ type that has been out
>>>>> >>>>>> there for 20 or so years. All these I/O streams from java.io
>>>>> >>>>>> were designed for simple synchronous use cases.
>>>>> >>>>>>
>>>>> >>>>>> It's not that their design is flawed in some way, it's that
>>>>> >>>>>> they don't seem to suit your needs. Have you considered using
>>>>> >>>>>> java.nio.channels.AsynchronousFileChannel in your applications?
>>>>> >>>>>>
>>>>> >>>>>> -Pavel
>>>>> >>>>>>
>>>>> >>>>>>> On 21 Oct 2016, at 17:08, Brunoais <brunoaiss at gmail.com> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> Any feedback on this? I'm really interested in
>>>>> >>>>>>> implementing such a BufferedReader/BufferedStreamReader to
>>>>> >>>>>>> speed up my applications without having to think in an
>>>>> >>>>>>> asynchronous or multi-threaded way while programming with it.
>>>>> >>>>>>>
>>>>> >>>>>>> That's why I'm asking this here.
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> >>>>>>> On 13/10/2016 14:45, Brunoais wrote:
>>>>> >>>>>>>> Hi,
>>>>> >>>>>>>>
>>>>> >>>>>>>> I looked at the BufferedReader source code for Java 9 along
>>>>> >>>>>>>> with the source code of the channels/streams used. I noticed
>>>>> >>>>>>>> that, as in Java 7, BufferedReader does not use an async API
>>>>> >>>>>>>> to load data from files; instead, the data loading is all
>>>>> >>>>>>>> done synchronously even when the OS allows requesting a file
>>>>> >>>>>>>> to be read and getting a notification later when the file
>>>>> >>>>>>>> has actually been read.
>>>>> >>>>>>>>
>>>>> >>>>>>>> Why is BufferedReader not async while providing a sync API?
>>>>> >>>>>>>>
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>