Re: best practice for processing large binary file

Andrew M andrew at oc384.net
Tue Sep 27 11:10:41 PDT 2011


> This should be covered by the Javadoc at
> http://download.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html#map(java.nio.channels.FileChannel.MapMode,%20long,%20long)
>
> The size parameter for the region to map cannot exceed Integer.MAX_VALUE.
> This may or may not be more than the available RAM. So in any case, you
> would need to prepare your code to map a file in chunks.

Yeah, I could do it in chunks (rough sketch after the list).  Say I have 2 GB of RAM to play with:
	1) Map the first 1 GB of the file into chunk A.
	2) Start processing chunk A in a new thread.
	3) Map the second 1 GB of the file into chunk B.
	4) Wait for processing of chunk A to complete.
	5) Discard chunk A.
	6) Start processing chunk B in a new thread.
	7) Map the third 1 GB of the file into chunk C.
	8) Wait for processing of chunk B to complete.
	etc...
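
Something like this untested sketch is what I mean. /tmp/big.dat and 
process() are placeholders, and the 1 GB chunk size is just an example; 
the next chunk is mapped while the worker thread is still chewing on 
the previous one:

    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.channels.FileChannel.MapMode;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ChunkedMapper {

        private static final long CHUNK_SIZE = 1L << 30; // 1 GB per mapping

        public static void main(String[] args) throws Exception {
            ExecutorService worker = Executors.newSingleThreadExecutor();
            FileChannel ch = FileChannel.open(
                    Paths.get("/tmp/big.dat"), StandardOpenOption.READ);
            try {
                long size = ch.size();
                Future<?> pending = null;
                for (long pos = 0; pos < size; pos += CHUNK_SIZE) {
                    long len = Math.min(CHUNK_SIZE, size - pos);
                    // Map the next chunk while the previous one is still
                    // being processed on the worker thread.
                    final MappedByteBuffer chunk =
                            ch.map(MapMode.READ_ONLY, pos, len);
                    if (pending != null) {
                        pending.get(); // wait for the previous chunk to finish
                    }
                    pending = worker.submit(new Runnable() {
                        public void run() {
                            process(chunk);
                        }
                    });
                }
                if (pending != null) {
                    pending.get();
                }
            } finally {
                ch.close();
                worker.shutdown();
            }
        }

        // Placeholder for the real per-chunk work.
        private static void process(MappedByteBuffer buf) {
            while (buf.hasRemaining()) {
                buf.get();
            }
        }
    }

One wrinkle: the JDK has no public unmap, so step 5's "discard" really 
just means dropping the reference and letting the GC release the 
mapping at some later point.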

It would be nice if Java provided some sliding buffer that does this 
for me using better code than what I'd probably write, automatically 
discarding read data and mapping new data in its place.
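
Roughly this kind of thing, as an untested sketch; the window size and 
the remap-on-exhaustion policy are just my guesses at how such a class 
might behave:

    import java.io.IOException;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // Hypothetical sliding window over a mapped file: remaps the next
    // region whenever the current mapping is exhausted.
    class SlidingMappedReader {
        private final FileChannel channel;
        private final long windowSize;
        private long position;          // file offset of the current window
        private MappedByteBuffer window;

        SlidingMappedReader(FileChannel channel, long windowSize)
                throws IOException {
            this.channel = channel;
            this.windowSize = windowSize;
            slide(0);
        }

        // Returns the next unsigned byte, remapping when the window is
        // exhausted, or -1 at end of file.
        int read() throws IOException {
            if (!window.hasRemaining()) {
                long next = position + window.capacity();
                if (next >= channel.size()) {
                    return -1;
                }
                slide(next);
            }
            return window.get() & 0xFF;
        }

        private void slide(long pos) throws IOException {
            position = pos;
            long len = Math.min(windowSize, channel.size() - pos);
            // The old mapping is only released once the previous buffer
            // is garbage collected; there is no explicit unmap.
            window = channel.map(FileChannel.MapMode.READ_ONLY, pos, len);
        }
    }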

> Given all these limitations of FileChannel.map(), I would look for something
> else, in particular for a long running application. A traditional buffering
> solution might be better. If you just need read-only access to the large
> file, you might want to consider this (shameless self-advertising):
> http://truezip.java.net/apidocs/de/schlichtherle/truezip/rof/BufferedReadOnlyFile.html
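
For what it's worth, a traditional buffered pass over a plain 
FileChannel would look something like this untested sketch, with 
/tmp/big.dat as a placeholder and the byte-by-byte loop standing in 
for real processing:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class BufferedScan {
        public static void main(String[] args) throws IOException {
            // One reusable 8 MB direct buffer instead of mapping the
            // file: memory stays bounded no matter how large it is.
            ByteBuffer buf = ByteBuffer.allocateDirect(8 * 1024 * 1024);
            FileChannel ch = FileChannel.open(
                    Paths.get("/tmp/big.dat"), StandardOpenOption.READ);
            try {
                while (ch.read(buf) != -1) {
                    buf.flip();
                    while (buf.hasRemaining()) {
                        buf.get(); // placeholder for real processing
                    }
                    buf.clear();
                }
            } finally {
                ch.close();
            }
        }
    }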

TrueZIP looks cool.  Any plans for 7z compression support?  I store my 
large stock/options market data files in 7z.
