Re: best practice for processing large binary file

David M. Lloyd david.lloyd at redhat.com
Tue Sep 27 11:33:29 PDT 2011


On 09/27/2011 01:10 PM, Andrew M wrote:
>> This should be covered by the Javadoc at
>> http://download.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html#map(java.nio.channels.FileChannel.MapMode,%20long,%20long)
>>
>>
>> The size parameter for the region to map cannot exceed Integer.MAX_VALUE.
>> This may or may not be more than the available RAM. So in any case, you
>> would need to prepare your code to map a file in chunks.
>
> Yeah I could do it in chunks. Say I have 2 GB of RAM to play with:
> 1) map first 1GB of file into chunk A.
> 2) start processing chunk A in a new thread
> 3) map second GB of file into chunk B.
> 4) Wait for processing chunk A to complete.
> 5) discard chunk A
> 6) start processing chunk B in a new thread.
> 7) map third GB of file into chunk C.
> 8) Wait for processing B to complete.
> etc...
>
> It would be nice if Java provided a sliding buffer that did this for
> me with better code than I'd probably write, automatically discarding
> already-read data and mapping new data in its place.

Mapping a file doesn't actually use RAM, it uses address space (and page 
table entries, and any of their corresponding OS-specific resources). 
So you could map a bunch of big GB-size regions and then just access the 
buffers randomly as needed, and use the force() method to sync out your 
changes.  Just avoid calling load(), though: it eagerly faults every 
page of the region into physical memory, which is exactly the pressure 
you're trying to avoid.
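A rough sketch of that approach, assuming the file is opened read-write; the helper names and the region-array layout are illustrative, not a prescribed API:

```java
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class RegionMap {

    // Map the whole file as an array of fixed-size READ_WRITE regions.
    // The mappings consume address space and page-table entries, not RAM;
    // the OS faults pages in only when they are actually touched.
    static MappedByteBuffer[] mapRegions(FileChannel ch, long regionSize)
            throws Exception {
        long size = ch.size();
        int n = (int) ((size + regionSize - 1) / regionSize);
        MappedByteBuffer[] regions = new MappedByteBuffer[n];
        for (int i = 0; i < n; i++) {
            long pos = i * regionSize;
            long len = Math.min(regionSize, size - pos);
            regions[i] = ch.map(FileChannel.MapMode.READ_WRITE, pos, len);
        }
        return regions;
    }

    // Random access across the whole file via (region, offset) arithmetic.
    static byte get(MappedByteBuffer[] r, long regionSize, long index) {
        return r[(int) (index / regionSize)].get((int) (index % regionSize));
    }

    static void put(MappedByteBuffer[] r, long regionSize, long index, byte b) {
        r[(int) (index / regionSize)].put((int) (index % regionSize), b);
    }

    // Sync all modified pages out to the file, as suggested above.
    static void sync(MappedByteBuffer[] regions) {
        for (MappedByteBuffer region : regions) region.force();
    }
}
```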

TBH though, it might be simpler (depending on your use case) to just use 
RandomAccessFile and be done with it.
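For comparison, the RandomAccessFile version really is short; this minimal helper (the name is mine) reads an arbitrary slice at any offset, with no mapping-size limit to work around:

```java
import java.io.RandomAccessFile;

public class RafRead {

    // Read len bytes starting at offset, using plain seek + readFully.
    // Offsets are longs, so files past 2 GB need no special handling.
    static byte[] readAt(String path, long offset, int len) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] buf = new byte[len];
            raf.seek(offset);
            raf.readFully(buf);
            return buf;
        }
    }
}
```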

-- 
- DML