Re: best practice for processing large binary file
David M. Lloyd
david.lloyd at redhat.com
Tue Sep 27 11:33:29 PDT 2011
On 09/27/2011 01:10 PM, Andrew M wrote:
>> This should be covered by the Javadoc at
>> http://download.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html#map(java.nio.channels.FileChannel.MapMode,%20long,%20long)
>>
>>
>> The size parameter for the region to map cannot exceed Integer.MAX_VALUE,
>> which may or may not be more than the available RAM. Either way, you need
>> to be prepared to map the file in chunks.
>
> Yeah I could do it in chunks. Say I have 2 GB of RAM to play with:
> 1) map first 1GB of file into chunk A.
> 2) start processing chunk A in a new thread
> 3) map second GB of file into chunk B.
> 4) Wait for processing chunk A to complete.
> 5) discard chunk A
> 6) start processing chunk B in a new thread.
> 7) map third GB of file into chunk C.
> 8) Wait for processing B to complete.
> etc...
>
> It would be nice if Java provided some sliding buffer that did this for
> me, using better code than what I'd probably write, automatically
> discarding read data and reading new data in its place.
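The pipeline sketched above can be written in fairly little code. Here is a minimal sketch: the class name, the 64 KiB chunk size (you'd use something closer to 1 GiB in practice), and the byte-summing workload are all placeholders, not anything from the original thread. A single worker thread processes chunk N while the main thread maps chunk N+1; the previous mapping is released when its buffer becomes unreachable and is garbage-collected, since the public API has no explicit unmap.

```java
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SlidingChunks {
    // 64 KiB chunks keep the demo small; in practice use something like 1 GiB.
    static final int CHUNK = 1 << 16;

    // Placeholder workload: sum the unsigned byte values of one chunk.
    static long sumChunk(MappedByteBuffer buf) {
        long sum = 0;
        while (buf.hasRemaining()) sum += buf.get() & 0xFF;
        return sum;
    }

    // Map chunk N+1 while a worker thread is still processing chunk N.
    static long processFile(Path file) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        long total = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            Future<Long> pending = null;
            for (long pos = 0; pos < size; pos += CHUNK) {
                MappedByteBuffer buf =
                    ch.map(FileChannel.MapMode.READ_ONLY, pos, Math.min(CHUNK, size - pos));
                if (pending != null) total += pending.get(); // wait for the previous chunk
                pending = worker.submit(() -> sumChunk(buf)); // process this one in the background
            }
            if (pending != null) total += pending.get();
        } finally {
            worker.shutdown();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // Self-contained demo: generate a small input file.
        Path file = Files.createTempFile("chunks", ".bin");
        byte[] data = new byte[200_000];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        Files.write(file, data);
        System.out.println(processFile(file)); // prints 25493856 for this input
        Files.deleteIfExists(file);
    }
}
```

Note that each mapped chunk is handed off to exactly one worker task, so no extra synchronization on the buffer is needed.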
Mapping a file doesn't actually use RAM; it uses address space (and page
table entries, and their corresponding OS-specific resources). So you
could map a bunch of big GB-sized regions, access the buffers randomly as
needed, and use the force() method to sync out your changes. Just avoid
calling load(), which touches every page and pulls the entire mapped
region into physical memory.
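As a rough illustration of that approach (class and method names here are made up for the sketch, and the 1 MiB demo region stands in for the multi-GiB mappings a 64-bit address space allows): map a region read-write, touch only the pages you need, then force() the dirty pages out.

```java
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapAndForce {
    // Map a region read-write, touch two pages, and force() the changes out.
    static void writeEnds(Path file, long size, byte first, byte last) throws Exception {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // map() reserves address space; physical pages are faulted in on access.
            // If the region extends past EOF, the file is grown to fit it.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
            buf.put(0, first);              // touches only the first page
            buf.put((int) size - 1, last);  // ...and the last page
            buf.force();                    // flush the dirty pages to storage
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("mapped", ".bin");
        // 1 MiB demo region; GiB-sized regions work the same way on 64-bit JVMs.
        writeEnds(file, 1 << 20, (byte) 42, (byte) 7);
        byte[] bytes = Files.readAllBytes(file);
        System.out.println(bytes[0] + " " + bytes[bytes.length - 1]); // prints: 42 7
    }
}
```

Even though the mapping is 1 MiB, only the two touched pages ever get faulted in and written back.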
TBH though, it might be simpler (depending on your use case) to just use
RandomAccessFile and be done with it.
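For comparison, the RandomAccessFile route needs no mapping at all; a minimal sketch (the class and helper name are illustrative) is just seek-and-read:

```java
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class RandomAccessDemo {
    // Read one byte at an arbitrary offset without mapping anything.
    static int readAt(Path file, long offset) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            return raf.read(); // -1 at EOF, otherwise 0..255
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("raf", ".bin");
        Files.write(file, new byte[] {10, 20, 30, 40});
        System.out.println(readAt(file, 2)); // prints: 30
        Files.deleteIfExists(file);
    }
}
```

There is no Integer.MAX_VALUE limit on the offset here, since seek() takes a long, though each call pays a system-call cost that a mapped buffer avoids.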
--
- DML