Request/discussion: BufferedReader reading using async API while providing sync API
Peter Levart
peter.levart at gmail.com
Wed Oct 26 17:24:04 UTC 2016
Hi Brunoais,
I'll try to share what I know from my JMH practice:
On 10/26/2016 10:30 AM, Brunoais wrote:
> Hey guys. Any idea where I can find instructions on how to use JMH to:
>
> 1. Clear OS' file reading cache.
You can create a public void method and have JMH call it before each:
- trial (a set of iterations)
- iteration (a set of test method invocations)
- invocation
...simply by annotating it with @Setup( [ Level.Trial | Level.Iteration |
Level.Invocation ] ).
So create a @Setup method that spawns a script which clears the cache.
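A minimal sketch of how such a hook could look (the script path and name are hypothetical; on Linux the script could run `sync && echo 3 > /proc/sys/vm/drop_caches` as root):

```java
import java.io.IOException;

import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class CacheClearingState {

    // Called by JMH before every iteration, so each iteration starts
    // with a cold OS file cache.
    @Setup(Level.Iteration)
    public void dropOsCache() throws IOException, InterruptedException {
        // drop-caches.sh is a placeholder for whatever script your OS needs
        Process p = new ProcessBuilder("/usr/local/bin/drop-caches.sh")
                .inheritIO()
                .start();
        if (p.waitFor() != 0) {
            throw new IllegalStateException("could not clear the OS cache");
        }
    }
}
```

Use Level.Trial instead if clearing the cache once per parameter combination is enough.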
> 2. Warm up whatever it needs to (maybe reading from a Channel in memory).
JMH already warms up the code and the VM simply by executing "warmup"
iterations before starting the real measured iterations. You can control
the number of warm-up and measured iterations by annotating
either the class or the method(s) with:
@Warmup(iterations = ...)
@Measurement(iterations = ...)
If you want to warm up resources with code that differs from the code in
the test method(s), then maybe @Setup methods at different levels could be
used for that.
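For example (the iteration counts and the benchmark body are just placeholders):

```java
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Warmup;

@Warmup(iterations = 5)        // discarded; lets the JIT settle
@Measurement(iterations = 10)  // only these iterations are reported
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class WarmupExample {

    @Benchmark
    public int work() {
        // stand-in workload; returns a value so it isn't dead code
        int sum = 0;
        for (int i = 0; i < 1_000; i++) {
            sum += i;
        }
        return sum;
    }
}
```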
>
> 3. Create a BufferedInputStream with a FileInputStream inside, with
> configurable buffer sizes.
You can annotate a field of int, long or String type, in a class
annotated with @State (it can be the benchmark class itself),
with the @Param annotation, enumerating the values this field will take
before the @Setup(Level.Trial) method(s) are executed. So you enumerate
the buffer sizes in the @Param annotation and instantiate the
BufferedInputStream using that value in a @Setup method. Voilà.
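A sketch of that wiring (the file path and the size values are made up for illustration):

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;

import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.TearDown;

@State(Scope.Benchmark)
public class BufferSizeState {

    // JMH runs a full set of iterations for every listed value
    @Param({"8192", "16384", "500000"})
    int bufferSize;

    // hypothetical path to the large test file
    String fileName = "/tmp/large-test-file.bin";

    BufferedInputStream in;

    @Setup(Level.Trial)
    public void open() throws IOException {
        in = new BufferedInputStream(new FileInputStream(fileName), bufferSize);
    }

    @TearDown(Level.Trial)
    public void close() throws IOException {
        in.close();
    }
}
```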
> 4. Execute iterations to read the file fully.
Then perhaps you could use only one invocation per iteration and
measure it using @BenchmarkMode(Mode.SingleShotTime), constructing the
loop yourself.
> 1. Allow setting the byte[] size.
Use @Param on a field to hold the byte[] size and create the byte[]
in a @Setup method...
> 2. On each iteration, burn a set number of CPU cycles.
Blackhole.consumeCPU(tokens)
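Putting points 4, 4.1 and 4.2 together, a single-shot benchmark could look roughly like this (the file path and the token count are arbitrary placeholders):

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.SingleShotTime)  // time one full pass per iteration
public class FullReadBenchmark {

    @Param({"1024", "8192"})  // 4.1: configurable byte[] size
    int chunkSize;

    byte[] chunk;
    InputStream in;

    @Setup(Level.Iteration)
    public void setup() throws IOException {
        chunk = new byte[chunkSize];
        // hypothetical test file; the buffer size could be a @Param too
        in = new BufferedInputStream(
                new FileInputStream("/tmp/large-test-file.bin"));
    }

    @Benchmark
    public long readFully() throws IOException {
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n;
            // 4.2: burn a fixed number of CPU "tokens" per chunk
            Blackhole.consumeCPU(1000);
        }
        return total;  // return the count so the loop isn't dead code
    }
}
```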
> 5. Re-execute 1, 3 and 4 but with a BufferedNonBlockStream and a
> FileChannel.
If you wrap them both in a common API (by delegation), you can use a
@Param String implType field, with a @Setup method to instantiate the
appropriate implementation. Then just invoke the common API in the test
method.
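The common API could be as small as this (names are made up; the commented-out case stands in for your FileChannel-based implementation):

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

// Minimal common API both implementations delegate to
interface ChunkSource {
    int read(byte[] dst) throws IOException;

    void close() throws IOException;
}

class BufferedStreamSource implements ChunkSource {
    private final InputStream in;

    BufferedStreamSource(InputStream raw, int bufferSize) {
        this.in = new BufferedInputStream(raw, bufferSize);
    }

    public int read(byte[] dst) throws IOException {
        return in.read(dst);
    }

    public void close() throws IOException {
        in.close();
    }
}

class ChunkSources {
    // In the benchmark, a @Setup method would call this, passing
    // a @Param String implType field as the first argument.
    static ChunkSource create(String implType, InputStream raw, int bufferSize) {
        switch (implType) {
            case "buffered":
                return new BufferedStreamSource(raw, bufferSize);
            // case "nonblock": return new NonBlockSource(...); // FileChannel-based impl
            default:
                throw new IllegalArgumentException("unknown impl: " + implType);
        }
    }
}
```

The test method then just calls read() on the ChunkSource field, and JMH reports one result per implType value.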
>
> So far I still can't find how to:
>
> 1 (clear OS' cache)
> 3 (the configuration part)
> 4 (variable number of iterations)
> 4.1 (the configuration)
>
> Can someone please point me in the right direction?
I can create an example test if you like and you can then extend it...
Regards, Peter
>
>
> On 26/10/2016 07:57, Brunoais wrote:
>>
>> Hey Bernd!
>>
>> I don't know how long ago you did that, but I'm getting positive
>> results with my non-JMH tests. I still have to sanity-check my
>> results, though. After a few reads, the OS starts caching the file,
>> which is not what I want. It's easy to tell when that happens: the
>> times drop from ~30s to ~5s and the HDD stays near idle while reading
>> (just looking at the LED is enough to tell).
>>
>> If you don't simulate synchronous work and only run the reads, you
>> will only see marginal gains, as the OS has no real time to fill the
>> buffer.
>> My research shows that the 2 major kernels (Windows' and Linux's) have
>> non-blocking user-level buffer handling where I hand the OS a buffer
>> to read into, and it keeps filling it, sending messages/signals as it
>> writes chunks. Linux also has an interrupt mode that only sends the
>> signal after the buffer is full, though. There's also another variant
>> where the kernel uses an internal buffer of the same size as the one
>> you allocate and then internally calls memcpy() into your user-level
>> memory when asked. Tests on the internet show that memcpy is as fast
>> (for 0-1 elements) or faster than System.arraycopy(). I have no idea
>> if they are true.
>>
>> All this was to add that that code is tuned to copy from the
>> read buffer only when it is at least at half capacity and the
>> internal buffer has enough storage space. The copy is forced only
>> if nothing was read on the previous fill() call. It is built to
>> use JNI as little as possible while providing the major contract
>> BufferedInputStream has.
>> Finally, I never, ever compact the read buffer. That would require a
>> memcpy which is definitely not necessary.
>>
>> Anyway, those timing tests were just to get an order of magnitude
>> for the speed difference. I intended to do them differently,
>> but JMH looks good so I'll use JMH to test now.
>>
>> Short reads only happen when fill(true) is called. That happens on a
>> desperate attempt to get data.
>>
>> I'll look into avoiding the double read requests. I don't think it
>> will bring significant improvements, if any at all. It only happens
>> when the buffer is nearly empty and any byte of data is welcome "at
>> any cost".
>> Besides, whoever called read() at that point would also have seen an
>> available() of 0 and still called read()/read(byte[]).
>>
>>
>> On 26/10/2016 06:14, Bernd Eckenfels wrote:
>>> Hello Brunoais,
>>>
>>> In the past I did some experiments with non-blocking file channels
>>> in the hope of increasing throughput in a similar way to your
>>> buffered stream. I also used directly allocated buffers. However, my
>>> results were not that encouraging (especially if an upper layer
>>> used larger reads). Back then I thought this was mostly due
>>> to the fact that it does NOT map to real async file I/O on most
>>> platforms. But maybe I just measured it wrong, so I will take a
>>> closer look at your implementation.
>>>
>>> Generally, I would recommend making the benchmark a bit more
>>> reliable with JMH and, to that end, externalizing the direct
>>> buffer allocation (as it is slow if done repeatedly). This also
>>> allows you to publish some results with varying workloads (on
>>> different machines).
>>>
>>> I would also measure the read count to see whether short reads happen.
>>>
>>> BTW, you might as well try to only read till the end of the buffer in
>>> the backfilling-wraps-around case and not issue two requests; that
>>> might remove some additional latency.
>>>
>>> Regards
>>> Bernd
>>> --
>>> http://bernd.eckenfels.net
>>>
>>> _____________________________
>>> From: Brunoais <brunoaiss at gmail.com>
>>> Sent: Monday, October 24, 2016 6:30 PM
>>> Subject: Re: Request/discussion: BufferedReader reading using async
>>> API while providing sync API
>>> To: Pavel Rappo <pavel.rappo at oracle.com>
>>> Cc: <core-libs-dev at openjdk.java.net>
>>>
>>>
>>> Attached and sending!
>>>
>>>
>>> On 24/10/2016 13:48, Pavel Rappo wrote:
>>> > Could you please send a new email on this list with the source
>>> attached as a
>>> > text file?
>>> >
>>> >> On 23 Oct 2016, at 19:14, Brunoais <brunoaiss at gmail.com> wrote:
>>> >>
>>> >> Here's my poc/prototype:
>>> >> http://pastebin.com/WRpYWDJF
>>> >>
>>> >> I've implemented the bare minimum of the class, following the
>>> same contract as BufferedReader, while flagging in comments all the
>>> issues I think it has or may have.
>>> >> I also wrote some javadoc to help guide readers through the class.
>>> >>
>>> >> I could have used more fields from BufferedReader, but the names
>>> were so minimalistic that they confused me. I intend to change them
>>> before sending this to OpenJDK.
>>> >>
>>> >> One of the major problems this has is long overflow. It is
>>> major because it is hidden, it will be extremely rare and it takes a
>>> really long time to reproduce. There are different ways of dealing
>>> with it, from just documenting it to actually writing code that
>>> handles it.
>>> >>
>>> >> I built a simple test program for it to get some idea of its
>>> performance and correctness.
>>> >>
>>> >> http://pastebin.com/eh6LFgwT
>>> >>
>>> >> This doesn't thoroughly test whether it is actually working
>>> correctly, but I see no reason for it not to work correctly after
>>> fixing the 2 bugs that test found.
>>> >>
>>> >> I'll also leave here some conclusions I reached about speed and
>>> resource consumption.
>>> >>
>>> >> I made tests with the default buffer size, 5000B, 15_000B and
>>> 500_000B. I noticed that, with my hardware and a 1_530_000_000B
>>> file, I was getting around:
>>> >>
>>> >> For all buffer sizes, with fake work: 10~15s speed improvement
>>> (from 90% HDD speed to 100% HDD speed)
>>> >> For all buffer sizes, with no fake work: 1~2s speed improvement
>>> (from 90% HDD speed to 100% HDD speed)
>>> >>
>>> >> Changing the buffer size gave different reading speeds, but
>>> both implementations changed by roughly the same amount as the
>>> buffer size changed.
>>> >> Finally, I could always confirm that I/O was the slowest part
>>> while this code was running.
>>> >>
>>> >> For those wondering about the file size: it is both to avoid the
>>> OS cache and to make the test match the main use-case these objects
>>> are for (large streams of bytes).
>>> >>
>>> >> @Pavel, are you open for discussion now ;)? Need anything else?
>>> >>
>>> >> On 21/10/2016 19:21, Pavel Rappo wrote:
>>> >>> Just to append to my previous email: BufferedReader wraps any
>>> Reader out there,
>>> >>> not specifically FileReader, while you're talking about the case
>>> of efficient
>>> >>> reading from a file.
>>> >>>
>>> >>> I guess there's one existing possibility to provide exactly what
>>> you need (as I
>>> >>> understand it) under this method:
>>> >>>
>>> >>> /**
>>> >>> * Opens a file for reading, returning a {@code BufferedReader}
>>> to read text
>>> >>> * from the file in an efficient manner...
>>> >>> ...
>>> >>> */
>>> >>> java.nio.file.Files#newBufferedReader(java.nio.file.Path)
>>> >>>
>>> >>> It can return _anything_ as long as it is a BufferedReader. We
>>> can do it, but it
>>> >>> needs to be investigated not only for your favorite OS but for
>>> other OSes as
>>> >>> well. Feel free to prototype this and we can discuss it on the
>>> list later.
>>> >>>
>>> >>> Thanks,
>>> >>> -Pavel
>>> >>>
>>> >>>> On 21 Oct 2016, at 18:56, Brunoais <brunoaiss at gmail.com> wrote:
>>> >>>>
>>> >>>> Pavel is right.
>>> >>>>
>>> >>>> In reality, I was expecting such a BufferedReader to use only a
>>> single buffer and have that buffer filled asynchronously, not
>>> in a different thread.
>>> >>>> Additionally, I don't intend to have a larger buffer
>>> than before unless stated through the API (the constructor).
>>> >>>>
>>> >>>> In my idea, internally, it is supposed to use
>>> java.nio.channels.AsynchronousFileChannel or equivalent.
>>> >>>>
>>> >>>> It does not prevent having two buffers and I do not intend to
>>> change BufferedReader itself. I'd make a BufferedAsyncReader of sorts
>>> (any name suggestion is welcome as I'm an awful namer).
>>> >>>>
>>> >>>>
>>> >>>> On 21/10/2016 18:38, Roger Riggs wrote:
>>> >>>>> Hi Pavel,
>>> >>>>>
>>> >>>>> I think Brunoais is asking for a double-buffering scheme in which
>>> the implementation of
>>> >>>>> BufferedReader fills (a second buffer) in parallel with the
>>> application reading from the 1st buffer,
>>> >>>>> managing the swaps and async reads transparently.
>>> >>>>> It would not change the API but would change the interactions
>>> between the buffered reader
>>> >>>>> and the underlying stream. It would also increase memory
>>> requirements and processing
>>> >>>>> by introducing or using a separate thread and the necessary
>>> synchronization.
>>> >>>>>
>>> >>>>> Though I think the formal interface semantics could be
>>> maintained, I have doubts
>>> >>>>> about compatibility and its unintended consequences on
>>> existing subclasses,
>>> >>>>> applications and libraries.
>>> >>>>>
>>> >>>>> $.02, Roger
>>> >>>>>
>>> >>>>> On 10/21/16 1:22 PM, Pavel Rappo wrote:
>>> >>>>>> Off the top of my head, I would say it's not possible to
>>> change the design of an
>>> >>>>>> _extensible_ type that has been out there for 20 or so years.
>>> All these I/O
>>> >>>>>> streams from java.io were designed for the
>>> simple synchronous use case.
>>> >>>>>>
>>> >>>>>> It's not that their design is flawed in some way, it's that
>>> they don't seem to
>>> >>>>>> suit your needs. Have you considered using
>>> java.nio.channels.AsynchronousFileChannel
>>> >>>>>> in your applications?
>>> >>>>>>
>>> >>>>>> -Pavel
>>> >>>>>>
>>> >>>>>>> On 21 Oct 2016, at 17:08, Brunoais <brunoaiss at gmail.com> wrote:
>>> >>>>>>>
>>> >>>>>>> Any feedback on this? I'm really interested in implementing
>>> such a BufferedReader/BufferedStreamReader to allow speeding up my
>>> applications without having to think in an asynchronous way or
>>> about multi-threading while programming with it.
>>> >>>>>>>
>>> >>>>>>> That's why I'm asking this here.
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On 13/10/2016 14:45, Brunoais wrote:
>>> >>>>>>>> Hi,
>>> >>>>>>>>
>>> >>>>>>>> I looked at the BufferedReader source code for Java 9 along
>>> with the source code of the channels/streams it uses. I noticed that,
>>> like in Java 7, BufferedReader does not use an async API to load data
>>> from files; instead, the data loading is all done synchronously, even
>>> when the OS allows requesting a file to be read and getting a
>>> notification later when the file has effectively been read.
>>> >>>>>>>>
>>> >>>>>>>> Why is BufferedReader not async while providing a sync API?
>>> >>>>>>>>
>>> >
>>>
>>>
>>>
>>
>