best practice for processing large binary file

Andrew M andrew at oc384.net
Mon Sep 26 17:26:56 PDT 2011


Thanks Christian.  I will keep that in mind.

What if the file is larger than not only the JVM heap but also the total
system RAM?

In those cases I would like the JVM to load the file into RAM
intelligently, ahead of my accessing it through ByteBuffer's getter
methods.  Once a portion of the data has been accessed, that portion of
the buffered data could be freed.  Is that how it works?
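
To make the question concrete, here is a rough sketch of what I have in
mind, assuming the paging is left to the operating system: the file is
mapped one window at a time, so no single mapping has to cover the whole
file (a single call to FileChannel.map is limited to Integer.MAX_VALUE
bytes anyway).  The class name, the method name and the 4-byte record
size are made up for illustration.  Whether the OS reads ahead of the
getters, and when it drops pages that have already been read, is
platform dependent and not something this code controls.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WindowedMapReader {
    // Hypothetical record layout: one 4-byte big-endian int per record.
    static final int RECORD_BYTES = 4;

    /** Sums every int in the file, mapping at most windowBytes at a time. */
    static long sumInts(Path file, long windowBytes) throws IOException {
        if (windowBytes <= 0 || windowBytes > Integer.MAX_VALUE
                || windowBytes % RECORD_BYTES != 0) {
            throw new IllegalArgumentException(
                "windowBytes must be a positive multiple of " + RECORD_BYTES
                + " and no larger than Integer.MAX_VALUE");
        }
        long sum = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += windowBytes) {
                long len = Math.min(windowBytes, size - pos);
                // Mapping does not copy the data onto the heap; pages are
                // brought in by the OS as the getters touch them.
                MappedByteBuffer buf =
                    ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (buf.remaining() >= RECORD_BYTES) {
                    sum += buf.getInt();
                }
                // The old window becomes unreachable here. According to the
                // Javadoc its mapping stays valid until the buffer is
                // garbage-collected; there is no explicit unmap call.
            }
        }
        return sum;
    }
}
```

Calling WindowedMapReader.sumInts(path, 64L * 1024 * 1024) would walk the
file in 64 MB windows.  Any trailing bytes that do not fill a whole
record are ignored.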

-Andrew

On 9/26/2011 5:48 PM, Christian Schlichtherle wrote:
> According to the Javadoc a MappedByteBuffer is a direct byte buffer, so it is not
> allocated on the heap. However you should know that on Windows, obtaining a
> MappedByteBuffer may block access to the file by third parties even after a call to
> close() on the channel, because the byte buffer is still effective. This may not change
> even if you call System.gc() because as said before, the byte buffer is not
> allocated on the heap. So using mapped byte buffers is a good choice for short
> running applications only. As far as I remember there is a bug report for this,
> which you may google for the latest update.
>>> For JDK7 I don't know if there's a better / new way but I would
>>> use MappedByteBuffer for such cases:
>>> http://download.oracle.com/javase/7/docs/api/java/nio/MappedByteBuffer.html
>>
>> Should I still have a thread reading the file's records into a queue while another consumer thread consumes the byte arrays from the queue? Or is it now just as good for me to have just a single thread calling myMappedByteBuffer.getFloat or getInt, etc since the myMappedByteBuffer is reading the file in another thread?
>>
>> What if the file being read is larger than available heap?
>>
>> Thanks!
>> Andrew


More information about the nio-discuss mailing list